PATCHED GENES AND THEIR USE
INTRODUCTION

The field of this invention concerns segment polarity genes and their uses. 
Background

Segment polarity genes were discovered in flies as mutations which change
the pattern of structures of the body segments. Mutations in the genes cause animals
to develop the changed patterns on the surfaces of body segments, the changes
affecting the pattern along the head to tail axis. For example, mutations in the gene
patched cause each body segment to develop without the normal structures in the
center of each segment. In their stead is a mirror image of the pattern normally
found in the anterior segment. Thus cells in the center of the segment make the
wrong structures, and point them in the wrong direction with reference to the
overall head-to-tail polarity of the animal. About sixteen genes in the class are known.
The encoded proteins include kinases, transcription factors, a cell junction protein,
two secreted proteins called wingless (WG) and hedgehog (HH), a single
transmembrane protein called patched (PTC), and some novel proteins not related to
any known protein. All of these proteins are believed to work together in signaling
pathways that inform cells about their neighbors in order to set cell fates and
polarities.

Many of the segment polarity proteins of Drosophila and other invertebrates
are closely related to vertebrate proteins, implying that the molecular mechanisms
involved are ancient. Among the vertebrate proteins related to the fly genes are
En-1 and En-2, which act in vertebrate brain development, and WNT-1, which is also
involved in brain development, but was first found as the oncogene implicated in
many cases of mouse breast cancer. In flies, the patched gene is transcribed into
RNA in a complex and dynamic pattern in embryos, including fine transverse stripes
in each body segment primordium. The encoded protein is predicted to contain
many transmembrane domains. It has no significant similarity to any other known
protein. Other proteins having large numbers of transmembrane domains include a
variety of membrane receptors, channels through membranes and transporters
through membranes.

The hedgehog (HH) protein of flies has been shown to have at least three
vertebrate relatives: Sonic hedgehog (Shh), Indian hedgehog, and Desert hedgehog.
Shh is expressed in a group of cells at the posterior of each developing limb
bud. This is exactly the same group of cells found to have an important role in
signaling polarity to the developing limb. The signal appears to be graded, with
cells close to the posterior source of the signal forming posterior digits and other
limb structures and cells farther from the signal source forming more anterior
structures. It has been known for many years that transplantation of the signaling
cells, a region of the limb bud known as the "zone of polarizing activity" (ZPA), has
dramatic effects on limb patterning. Implanting a second ZPA anterior to the limb
bud causes a limb to develop with posterior features replacing the anterior ones (in
essence little fingers instead of thumbs). Shh has been found to be the long sought
ZPA signal. Cultured cells making Shh protein (SHH), when implanted into the
anterior limb bud region, have the same effect as an implanted ZPA. This
establishes that Shh is clearly a critical trigger of posterior limb development.

The factor in the ZPA has been thought for some time to be related to
another important developmental signal that polarizes the developing spinal cord.
The notochord, a rod of mesoderm that runs along the dorsal side of early vertebrate
embryos, is a signal source that polarizes the neural tube along the dorsal-ventral
axis. The signal causes the part of the neural tube nearest to the notochord to form
floor plate, a morphologically distinct part of the neural tube. The floor plate, in
turn, sends out signals to the more dorsal parts of the neural tube to further
determine cell fates. The ZPA was reported to have the same signaling effect as the
notochord when transplanted to be adjacent to the neural tube, suggesting the ZPA
makes the same signal as the notochord. In keeping with this view, Shh was found
to be produced by notochord cells and floor plate cells. Tests of extra expression of
Shh in mice led to the finding of extra expression of floor plate genes in cells which
would not normally turn them on. Therefore Shh appears to be a component of the
signal from notochord to floor plate and from floor plate to more dorsal parts of the
neural tube. Besides limb and neural tubes, vertebrate hedgehog genes are also
expressed in many other tissues including, but not limited to, the peripheral nervous
system, brain, lung, liver, kidney, tooth primordia, genitalia, and hindgut and
foregut endoderm.

PTC has been proposed as a receptor for HH protein based on genetic
experiments in flies. A model for the relationship is that PTC acts through a largely
unknown pathway to inactivate both its own transcription and the transcription of the
wingless segment polarity gene. This model proposes that HH protein, secreted
from adjacent cells, binds to the PTC receptor, inactivates it, and thereby prevents
PTC from turning off its own transcription or that of wingless. A number of
experiments have shown coordinate events between PTC and HH.
Relevant Literature

Descriptions of patched, by itself or its role with hedgehog, may be found in
Hooper and Scott, Cell 59, 751-765 (1989); Nakano et al., Nature 341, 508-513
(1989) (both of which also describe the sequence for Drosophila patched); Simcox
et al., Development 107, 715-722 (1989); Hidalgo and Ingham, Development 110,
291-301 (1990); Phillips et al., Development 110, 105-114 (1990); Sampedro and
Guerrero, Nature 353, 187-190 (1991); Ingham et al., Nature 353, 184-187 (1991);
and Taylor et al., Mechanisms of Development 42, 89-96 (1993). Discussions of
the role of hedgehog include Riddle et al., Cell 75, 1401-1416 (1993); Echelard et
al., Cell 75, 1417-1430 (1993); Krauss et al., Cell 75, 1431-1444 (1993); Tabata
and Kornberg, Cell 76, 89-102 (1994); Heemskerk & DiNardo, Cell 76, 449-460
(1994); Roelink et al., Cell 76, 761-775 (1994); and a short review article by
Ingham, Current Biology 4, 347-350 (1994). The sequence for the Drosophila 5'
non-coding region was reported to the GenBank, accession number M28418,
referred to in Hooper and Scott (1989), supra. See also, Forbes et al.,
Development 1993 Supplement 115-124.

SUMMARY OF THE INVENTION

Methods for isolating patched genes, particularly mammalian patched genes,
including the mouse and human patched genes, as well as invertebrate patched genes
and sequences, are provided. The methods include identification of patched genes
from other species, as well as members of the same family of proteins. The subject
genes provide methods for producing the patched protein, where the genes and
proteins may be used as probes for research, diagnosis, binding of hedgehog protein
for its isolation and purification, gene therapy, as well as other utilities.






BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a graph having a restriction map of about 10 kbp of the 5' region
upstream from the initiation codon of the Drosophila patched gene and bar graphs of
constructs of truncated portions of the 5' region joined to β-galactosidase, where the
constructs are introduced into fly cell lines for the production of embryos. The
expression of β-gal in the embryos is indicated in the right-hand table during early
and late development of the embryo. The greater the number of +'s, the more
intense the staining.









DESCRIPTION OF THE SPECIFIC EMBODIMENTS
Methods are provided for identifying members of the patched (ptc) gene
family from invertebrate and vertebrate, e.g. mammalian, species, as well as the
entire cDNA sequence of the mouse and human patched gene. Also, sequences for
invertebrate patched genes are provided. The patched gene encodes a
transmembrane protein having a large number of transmembrane sequences.

In identifying the mouse and human patched genes, primers were employed
to move through the evolutionary tree from the known Drosophila ptc sequence.
Two primers are employed from the Drosophila sequence with appropriate
restriction enzyme linkers to amplify portions of genomic DNA of a related
invertebrate, such as mosquito. The sequences are selected from regions which are
not likely to diverge over evolutionary time and are of low degeneracy.
Conveniently, the regions are the N-terminal proximal sequence, generally within
the first 1.5 kb, usually within the first 1 kb, of the coding portion of the cDNA,
conveniently in the first hydrophilic loop of the protein. Employing the polymerase
chain reaction (PCR) with the primers, a band can be obtained from mosquito
genomic DNA. The band may then be amplified and used in turn as a probe. One
may use this probe to probe a cDNA library from an organism in a different branch
of the evolutionary tree, such as a butterfly. By screening the library and
identifying sequences which hybridize to the probe, a portion of the butterfly
patched gene may be obtained. One or more of the resulting clones may then be
used to rescreen the library to obtain an extended sequence, up to and including the
entire coding region, as well as the non-coding 5'- and 3'-sequences. As
appropriate, one may sequence all or a portion of the resulting cDNA coding
sequence.
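
The preference above for primer regions "of low degeneracy" can be made concrete by counting how many distinct nucleotide sequences could encode a candidate stretch of amino acids. The following is a minimal sketch in Python, assuming the standard genetic code; the function names and the example peptides are illustrative assumptions and are not part of the disclosure.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def peptide_degeneracy(peptide):
    """Return how many nucleotide sequences can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(protein, width):
    """Return (start, window, degeneracy) for the least degenerate window.

    Ties are resolved in favor of the window closest to the N-terminus.
    """
    if width < 1 or width > len(protein):
        raise ValueError("width must be between 1 and the protein length")
    best = None
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        score = peptide_degeneracy(window)
        if best is None or score < best[2]:
            best = (start, window, score)
    return best
```

For instance, a six-residue stretch rich in Trp, Met, or two-codon residues scores far lower than one rich in Leu, Ser, or Arg, which is why such stretches make more practical primer sites. A real design would also weigh conservation across species, which this sketch does not consider.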

One may then screen a genomic or cDNA library of a species higher in the
evolutionary scale with appropriate probes from one or both of the prior sequences.
Of particular interest is screening a genomic library of a distantly related
invertebrate, e.g. beetle, where one may use a combination of the sequences
obtained from the previous two species, in this case, the Drosophila and the
butterfly. By appropriate techniques, one may identify specific clones which bind to
the probes, which may then be screened for cross hybridization with each of the
probes individually. The resulting fragments may then be amplified, e.g. by
subcloning.

By having all or parts of the 4 different patched genes, in the presently
illustrated example, Drosophila (fly), mosquito, butterfly and beetle, one can now
compare patched genes for conserved sequences. Cells from an appropriate
mammalian limb bud or other cells expressing patched, such as notochord, neural
tube, gut, lung buds, or other tissue, particularly fetal tissue, may be employed for
screening. Alternatively, adult tissue which produces patched may be employed for
screening. Based on the consensus sequence available from the 4 other species, one
can develop probes where at each site at least 2 of the sequences have the same
nucleotide and, where the site varies such that each species has a unique nucleotide,
inosine may be used, which binds to all 4 nucleotides.
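
The probe-design rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four sequences are taken to be pre-aligned and of equal length, inosine is written as "I", and when two different nucleotides are each shared by two species the one seen first is chosen (the text does not specify a tie rule). The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def consensus_probe(aligned_sequences):
    """Build a probe from aligned nucleotide sequences.

    At each site, use the nucleotide shared by at least two sequences;
    if every sequence has a different nucleotide, use inosine ("I").
    """
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    probe = []
    for column in zip(*aligned_sequences):
        # most_common keeps first-seen order among equal counts.
        base, count = Counter(column).most_common(1)[0]
        probe.append(base if count >= 2 else "I")
    return "".join(probe)


# Toy alignment standing in for fly, mosquito, butterfly and beetle.
toy = ["ACGT", "ACGA", "ATGC", "AGCG"]
print(consensus_probe(toy))
```

In the toy alignment, the last column holds four different nucleotides, so that position of the probe becomes inosine, while the other positions take the shared nucleotide.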

Either PCR may be employed using primers or, if desired, a genomic library
from an appropriate source may be probed. With PCR, one may use a cDNA
library or use reverse transcriptase-PCR (RT-PCR), where mRNA is available from
the tissue. Usually, where fetal tissue is employed, one will employ tissue from the
first or second trimester, preferably the latter half of the first trimester or the second
trimester, depending upon the particular host. The age and source of tissue will
depend to a significant degree on the ability to surgically isolate the tissue based on
its size, the level of expression of patched in the cells of the tissue, the accessibility
of the tissue, the number of cells expressing patched and the like. The amount of
tissue available should be large enough so as to provide for a sufficient amount of
mRNA to be usefully transcribed and amplified. With mouse tissue, limb bud of
from about 10 to 15 dpc (days post conception) may be employed.

In the primers, the complementary binding sequence will usually be at least
14 nucleotides, preferably at least about 17 nucleotides and usually not more than
about 30 nucleotides. The primers may also include a restriction enzyme sequence
for isolation and cloning. With RT-PCR, the mRNA may be enriched in accordance
with known ways, reverse transcribed, followed by amplification with the
appropriate primers. (Procedures employed for molecular cloning may be found in
Molecular Cloning: A Laboratory Manual, Sambrook et al., eds., Cold Spring
Harbor Laboratories, Cold Spring Harbor, NY, 1988). Particularly, the primers may
conveniently come from the N-terminal proximal sequence or other conserved
region, such as those sequences where at least five amino acids are conserved out of
eight amino acids in three of the four sequences. This is illustrated by the sequences
(SEQ ID NO:11) IITPLDCFWEG, (SEQ ID NO:12) LIVGG, and (SEQ ID NO:13)
PFFWEQY. Resulting PCR products of expected size are subcloned and may be
sequenced if desired.
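
The conservation criterion above (at least five of eight amino acids conserved in three of the four sequences) can be read as a sliding-window test over an alignment. The sketch below follows one possible reading, stated here as an assumption: a column counts as conserved when at least three sequences carry the same residue, and a window of eight columns qualifies when at least five of its columns are conserved. Gap characters ("-") are ignored when counting; all names and the toy alignments are hypothetical.

```python
from collections import Counter


def conserved_windows(aligned, window=8, min_conserved=5, min_agree=3):
    """Return start indices of windows meeting the conservation rule."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    flags = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]
        if residues:
            top_count = Counter(residues).most_common(1)[0][1]
        else:
            top_count = 0
        # A column is conserved if enough sequences share one residue.
        flags.append(top_count >= min_agree)

    return [
        start
        for start in range(length - window + 1)
        if sum(flags[start:start + window]) >= min_conserved
    ]
```

Each returned index marks the first column of a candidate primer region; overlapping hits can be merged into longer conserved stretches before primers are chosen.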

The cloned PCR fragment may then be used as a probe to screen a cDNA
library of mammalian tissue cells expressing patched, where hybridizing clones may
be isolated under appropriate conditions of stringency. Again, the cDNA library
should come from tissue which expresses patched, which tissue will come within the
limitations previously described. Clones which hybridize may be subcloned and
rescreened. The hybridizing subclones may then be isolated and sequenced or may
be further analyzed by employing RNA blots and in situ hybridizations in whole and
sectioned embryos. Conveniently, a fragment of from about 0.5 to 1 kbp of the
N-terminal coding region may be employed for the Northern blot.

WO 96/11260

PCT/US95/13233

The mammalian gene may be sequenced and, as described above, conserved
regions identified and used as primers for investigating other species. The
N-terminal proximal region, the C-terminal region or an intermediate region may be
employed for the sequences, where the sequences will be selected having minimum
degeneracy and the desired level of conservation over the probe sequence.

The DNA sequence encoding PTC may be cDNA or genomic DNA or a
fragment thereof, particularly complete exons from the genomic DNA; may be
isolated as the sequence substantially free of wild-type sequence from the
chromosome; may be a 50 kbp fragment or smaller fragment; and may be joined to
heterologous or foreign DNA, which may be a single nucleotide, an oligonucleotide of
up to 50 bp, which may be a restriction site or other identifying DNA for use as a
primer, probe or the like, or a nucleic acid of greater than 50 bp, where the nucleic
acid may be a portion of a cloning or expression vector, comprise the regulatory
regions of an expression cassette, or the like. The DNA may be isolated, purified
being substantially free of proteins and other nucleic acids, be in solution, or the
like.

The subject gene may be employed for producing all or portions of the
patched protein. The subject gene or fragment thereof, generally a fragment of at
least 12 bp, usually at least 18 bp, may be introduced into an appropriate vector for
extrachromosomal maintenance or for integration into the host. Fragments will
usually be immediately joined at the 5' and/or 3' terminus to a nucleotide or
sequence not found in the natural or wild-type gene, or joined to a label other than a
nucleic acid sequence. For expression, an expression cassette may be employed,
providing for a transcriptional and translational initiation region, which may be
inducible or constitutive, the coding region under the transcriptional control of the
transcriptional initiation region, and a transcriptional and translational termination
region. Various transcriptional initiation regions may be employed which are
functional in the expression host. The peptide may be expressed in prokaryotes or
eukaryotes in accordance with conventional ways, depending upon the purpose for
expression. For large production of the protein, a unicellular organism or cells of a
higher organism, e.g. eukaryotes such as vertebrates, particularly mammals, may be
used as the expression host, such as E. coli, B. subtilis, S. cerevisiae, and the like.
In many situations, it may be desirable to express the patched gene in a mammalian
host, whereby the patched gene product will be transported to the cellular membrane for
various studies. The protein has two parts which provide for a total of six
transmembrane regions, with a total of six extracellular loops, three for each part.
The character of the protein has similarity to a transporter protein. The protein has
two conserved glycosylation signal triads.

The subject nucleic acid sequences may be modified for a number of
purposes, particularly where they will be used intracellularly, for example, by being
joined to a nucleic acid cleaving agent, e.g. a chelated metal ion, such as iron or
chromium for cleavage of the gene; as an antisense sequence; or the like.
Modifications may include replacing oxygen of the phosphate esters with sulfur or
nitrogen, replacing the phosphate with phosphoramide, etc.

With the availability of the protein in large amounts by employing an
expression host, the protein may be isolated and purified in accordance with
conventional ways. A lysate may be prepared of the expression host and the lysate
purified using HPLC, exclusion chromatography, gel electrophoresis, affinity
chromatography, or other purification technique. The purified protein will generally
be at least about 80% pure, preferably at least about 90% pure, and may be up to
100% pure. By pure is intended free of other proteins, as well as cellular debris.

The polypeptide may be used for the production of antibodies, where short
fragments provide for antibodies specific for the particular polypeptide, whereas
larger fragments or the entire gene allow for the production of antibodies over the
surface of the polypeptide or protein, where the protein may be in its natural
conformation.

Antibodies may be prepared in accordance with conventional ways, where
the expressed polypeptide or protein may be used as an immunogen, by itself or
conjugated to known immunogenic carriers, e.g. KLH, pre-S HBsAg, other viral or
eukaryotic proteins, or the like. Various adjuvants may be employed, with a series
of injections, as appropriate. For monoclonal antibodies, after one or more booster
injections, the spleen may be isolated, the splenocytes immortalized, and then
screened for high affinity antibody binding. The immortalized cells, e.g.
hybridomas, producing the desired antibodies may then be expanded. For further
description, see Monoclonal Antibodies: A Laboratory Manual, Harlow and Lane
eds., Cold Spring Harbor Laboratories, Cold Spring Harbor, New York, 1988. If
desired, the mRNA encoding the heavy and light chains may be isolated and
mutagenized by cloning in E. coli, and the heavy and light chains may be mixed to
further enhance the affinity of the antibody. The antibodies may find use in
diagnostic assays for detection of the presence of the PTC protein on the surface of
cells or to inhibit the transduction of signal by the PTC protein ligand by competing
for the binding site.

The mouse patched gene (SEQ ID NO:09) encodes a protein (SEQ ID
NO:10) which has about 38% identical amino acids to fly PTC (SEQ ID NO:6) over
about 1,200 amino acids. This amount of conservation is dispersed through much
of the protein excepting the C-terminal region. The mouse protein also has a 50
amino acid insert relative to the fly protein. The human patched gene (SEQ ID
NO:18) contains an open reading frame of about 1450 amino acids (SEQ ID NO:19)
that is about 96% identical (98% similar) to mouse ptc (SEQ ID NO:09). The
human patched gene (SEQ ID NO:18), including coding and non-coding sequences,
is about 89% identical to the mouse patched gene (SEQ ID NO:09).

The butterfly PTC homolog (SEQ ID NO:4) is 1,300 amino acids long and
overall has a 50% amino acid identity (72% similarity) to fly PTC (SEQ ID NO:6).
With the exception of a divergent C-terminus, this homology is evenly spread across
the coding sequence. A 267 bp exon from the beetle patched gene encodes an 89
amino acid protein fragment which was found to be 44% and 51% identical to the
corresponding regions of fly and butterfly PTC respectively.
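
Percent identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, offered as an assumption since the text does not say which was used: identical residues divided by the number of aligned columns in which neither sequence has a gap. It also includes a small check that an exon length in base pairs corresponds to a whole number of codons. Function names are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned, ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def codons_in(base_pairs):
    """Return (whole codons, leftover bases) for a coding segment."""
    return divmod(base_pairs, 3)
```

Applied to the beetle exon, `codons_in(267)` gives 89 whole codons with no remainder, consistent with the 89 amino acid fragment reported. Other identity conventions (for example, dividing by the length of the shorter sequence) give somewhat different percentages.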

The mouse ptc message is about 8 kb long and the message is present in low
levels as early as 7 dpc, the abundance increasing by 11 and 15 dpc. Northern blot
indicates a clear decrease in the amount of message at 17 dpc. In the adult, PTC
[...] in the kidney and liver. [...] muscle and testes.

In embryos, ptc mRNA is present at 7 dpc [...] zone of the central nervous
system [...] perichondrium, condensing [...] is present in a wide range of tissues
from endodermal, [...] of limbs, the differentiation of lung tissue, and the generation
of [...] primates, more particularly, humans, or [...] to determine the relationship
with the level [...] comprising the 5' non-coding region. If necessary, one may walk
the fragment to obtain further 5' sequence to ensure that one has at least a functional
portion of the enhancer. It is found that the enhancer is proximal to the 5' coding
region, a portion being in the transcribed sequence and downstream from the promoter
sequences. The transcriptional initiation region may be used for many purposes,
studying embryonic development, providing for regulated expression of patched
protein or other protein of interest during embryonic development or thereafter, and
in gene therapy.

The gene may also be used for gene therapy, by transfection of the normal
gene into embryonic stem cells or into mature cells. A wide variety of viral vectors
can be employed for transfection and stable integration of the gene into the genome
of the cells. Alternatively, micro-injection may be employed, fusion, or the like for
introduction of genes into a suitable host cell. See, for example, Dhawan et al.,
Science 254, 1509-1512 (1991) and Smith et al., Molecular and Cellular Biology
(1990) 3268-3271.

By providing for the production of large amounts of PTC protein, one can
use the protein for identifying ligands which bind to the PTC protein. Particularly,
one may produce the protein in cells and employ the polysomes in columns for
isolating ligands for the PTC protein. One may incorporate the PTC protein into
liposomes by combining the protein with appropriate lipid surfactants, e.g.
phospholipids, cholesterol, etc., and sonicate the mixture of the PTC protein and the
surfactants in an aqueous medium. With one or more established ligands, e.g.
hedgehog, one may use the PTC protein to screen for antagonists which inhibit the
binding of the ligand. In this way, drugs may be identified which can prevent the
transduction of signals by the PTC protein in normal or abnormal cells.

The PTC protein, particularly binding fragments thereof, the gene encoding
the protein, or fragments thereof, particularly fragments of at least about 18
nucleotides, frequently of at least about 30 nucleotides and up to the entire gene,
more particularly sequences associated with the hydrophilic loops, may be employed
in a wide variety of assays. In these situations, the particular molecules will
normally be joined to another molecule, serving as a label, where the label can
directly or indirectly provide a detectable signal. Various labels include
radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules,
particles, e.g. magnetic particles, and the like. Specific binding molecules include
pairs, such as biotin and streptavidin, digoxin and antidigoxin etc. For the specific
binding members, the complementary member would normally be labeled with a
molecule which provides for detection, in accordance with known procedures. The
assays may be used for detecting the presence of molecules which bind to the
patched gene or PTC protein, in isolating molecules which bind to the patched gene,
for measuring the amount of patched, either as the protein or the message, for
identifying molecules which may serve as agonists or antagonists, or the like.

Various formats may be used in the assays. For example, mammalian or
invertebrate cells may be designed where the cells respond when an agonist binds to
PTC in the membrane of the cell. An expression cassette may be introduced into
the cell, where the transcriptional initiation region of patched is joined to a marker
gene, such as β-galactosidase, for which a substrate forming a blue dye is available.
A 1.5 kb fragment that responds to PTC signaling has been identified and shown to
regulate expression of a heterologous gene during embryonic development. When
an agonist binds to the PTC protein, the cell will turn blue. By employing a
competition between an agonist and a compound of interest, absence of blue color
formation will indicate the presence of an antagonist. These assays are well known
in the literature. Instead of cells, one may use the protein in a membrane
environment and determine binding affinities of compounds. The PTC may be
bound to a surface and a labeled ligand for PTC employed. A number of labels
have been indicated previously. The candidate compound is added with the labeled
ligand in an appropriate buffered medium to the surface bound PTC. After an
incubation to ensure that binding has occurred, the surface may be washed free of
any non-specifically bound components of the assay medium, particularly any
non-specifically bound labeled ligand, and any label bound to the surface determined.
Where the label is an enzyme, substrate producing a detectable product may be used.
The label may be detected and measured. By using standards, the binding affinity of
the candidate compound may be determined.

The availability of the gene and the protein allows for investigation of the
development of the fetus and the role patched and other molecules play in such
development. By employing antisense sequences of the patched gene, where the
sequences may be introduced in cells in culture, or a vector providing for
transcription of the antisense of the patched gene introduced into the cells, one can
investigate the role the PTC protein plays in the cellular development. By providing
for the PTC protein or fragment thereof in a soluble form which can compete with
the normal cellular PTC protein for ligand, one can inhibit the binding of ligands to
the cellular PTC protein to see the effect of variation in concentration of ligands for
the PTC protein on the cellular development of the host. Antibodies against PTC
can also be used to block function, since PTC is exposed on the cell surface.

The subject gene may also be used for preparing transgenic laboratory
animals, which may serve to investigate embryonic development and the role the
PTC protein plays in such development. By providing for variation in the
expression of the PTC protein, employing different transcriptional initiation regions
which may be constitutive or inducible, one can determine the developmental effect
of the differences in PTC protein levels. Alternatively, one can use the DNA to
knock out the PTC protein in embryonic stem cells, so as to produce hosts with only
a single functional patched gene or where the host lacks a functional patched gene.
By employing homologous recombination, one can introduce a patched gene which
is differentially regulated, for example, is expressed during the development of the
fetus, but not in the adult. One may also provide for expression of the patched gene in
cells or tissues where it is not normally expressed or at abnormal times of
development. One may provide for mis-expression or failure of expression in
certain tissue to mimic a human disease. Thus, mouse models of spina bifida or
abnormal motor neuron differentiation in the developing spinal cord are made
available. In addition, by providing expression of PTC protein in cells in which it is
otherwise not normally produced, one can induce changes in cell behavior upon
binding of ligand to the PTC protein.

Areas of investigation may include the development of cancer treatments.
The wingless gene, whose transcription is regulated in flies by PTC, is closely
related to a mammalian oncogene, Wnt-1, a key factor in many cases of mouse
breast cancer. Other Wnt family members, which are secreted signaling proteins,
are implicated in many aspects of development. In flies, the signaling factor
decapentaplegic, a member of the TGF-beta family of signaling proteins, known to
affect growth and development in mammals, is also controlled by PTC. Since
members of both the TGF-beta and Wnt families are expressed in mice in places
close to or overlapping with patched, the common regulation provides an opportunity
in treating cancer. Also, for repair and regeneration, proliferation competent cells
making PTC protein can find use to promote regeneration and healing for damaged
tissue, which tissue may be regenerated by transfecting cells of damaged tissue with
the ptc gene and its normal transcription initiation region or a modified transcription
initiation region. For example, PTC may be useful to stimulate growth of new teeth
by engineering cells of the gums or other tissues where PTC protein was expressed
during an earlier developmental stage or is expressed.

Since Northern blot analysis indicates that ptc is present at high levels in
adult lung tissue, the regulation of ptc expression or binding to its natural ligand
may serve to inhibit proliferation of cancerous lung cells. The availability of the
gene encoding PTC and the expression of the gene allows for the development of
agonists and antagonists. In addition, PTC is central to the ability of neurons to
differentiate early in development. The availability of the gene allows for the
introduction of PTC into host diseased tissue, stimulating the fetal program of
division and/or differentiation. This could be done in conjunction with other genes
which provide for the ligands which regulate PTC activity or by providing for
agonists other than the natural ligand.

The availability of the coding region for various ptc genes from various
species allows for the isolation of the 5' non-coding region comprising the promoter
and enhancer associated with the ptc genes, so as to provide transcriptional and
post-transcriptional regulation of the ptc gene or other genes, which allow for
regulation of genes in relation to the regulation of the ptc gene. Since the ptc gene is
autoregulated, activation of the ptc gene will result in activation of transcription of a
gene under the transcriptional control of the transcriptional initiation region of the
ptc gene. The transcriptional initiation region may be obtained from any host
species and introduced into a heterologous host species, where such initiation region
is functional to the desired degree in the foreign host. For example, a fragment of
from about 1.5 kb upstream from the initiation codon, up to about 10 kb, preferably
up to about 5 kb, may be used to provide for transcriptional initiation regulated by
the PTC protein, particularly the Drosophila 5'-non-coding region (GenBank
accession no. M28418).
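As a minimal sketch of how such an upstream fragment could be cut out of a genomic sequence in silico, the following Python function returns the bases immediately 5' of the initiation codon. The function name, its parameters, and the toy sequence are hypothetical illustrations and are not taken from the specification.

```python
def upstream_fragment(genomic: str, start_codon_index: int, length_bp: int) -> str:
    """Return up to `length_bp` bases immediately 5' of the initiation codon."""
    codon = genomic[start_codon_index:start_codon_index + 3].upper()
    if codon != "ATG":
        raise ValueError("start_codon_index does not point at an ATG")
    # Clamp at the start of the sequence if less upstream DNA is available.
    begin = max(0, start_codon_index - length_bp)
    return genomic[begin:start_codon_index]


# Toy sequence for illustration only; this is not ptc sequence.
toy = "GATTACAGATTACAGGCC" + "ATGGCTTAA"
fragment = upstream_fragment(toy, toy.index("ATG"), 10)
print(fragment)
```

With a real genomic clone, `length_bp` would be chosen in the range discussed above (for example 1500 to 10000).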



The following examples are offered by way of illustration, not by way of limitation.
Methods and Materials

I. PCR on Mosquito (Anopheles gambiae) Genomic DNA

PCR primers were based on amino acid stretches of fly PTC that were not
likely to diverge over evolutionary time and were of low degeneracy. Two such
primers (P2R1 (SEQ ID NO:14): GGACGAATTCAARGTNCAYCARYTNTGG;
P4R1 (SEQ ID NO:15): GGACGAATTCCYTCCCARAARCANTC; the 5'
GGACGAATTC sequences are EcoRI linkers) amplified an appropriately sized band from
mosquito genomic DNA using the PCR. The program conditions were as follows:

94 °C 4 min.; 72 °C Add Taq;
[49 °C 30 sec.; 72 °C 90 sec.; 94 °C 15 sec.] 3 times
[94 °C 15 sec.; 50 °C 30 sec.; 72 °C 90 sec.] 35 times
72 °C 10 min.; 4 °C hold

The product was subcloned and sequenced using the USB Sequence kit.






Using the mosquito PCR product (SEQ ID NO:7) as a probe, a 3 day
embryonic Precis coenia λgt10 cDNA library (generously provided by Sean
Carroll) was screened. Filters were hybridized at 65 °C overnight in a solution
containing 5x SSC, 10% dextran sulfate, 5x Denhardt's, 200 μg/ml sonicated
salmon sperm DNA, and 0.5% SDS. Filters were washed in 0.1x SSC, 0.1% SDS
at room temperature several times to remove nonspecific hybridization. Of the
100,000 plaques initially screened, 2 overlapping clones, L1 and L2, were isolated
which corresponded to the N-terminus of butterfly PTC. Using L2 as a probe, the
library filters were rescreened and 3 additional clones (L5, L7, L8) were isolated
which encompassed the remainder of the ptc coding sequence. The full length
sequence of butterfly ptc (SEQ ID NO:3) was determined by ABI automated
sequencing.

III. Screen of a Tribolium (Beetle) Genomic Library with Mosquito PCR Product and 900 bp Fragment from the Butterfly Clone

A λgem11 genomic library from Tribolium castaneum (gift of Rob Dennell)
was probed with a mixture of the mosquito PCR product (SEQ ID NO:7) and a
BstXI/EcoRI fragment of L2. Filters were hybridized at 55 °C overnight and
washed as above. Of the 75,000 plaques screened, 14 clones were identified and the
SacI fragment of T8 (SEQ ID NO:1), which cross-hybridized with the mosquito and
butterfly probes, was subcloned into pBluescript.



IV. Cloning of Mouse ptc Homologues
Two degenerate PCR primers (P4REV (SEQ ID NO:16):
GGACGAATTCYTNGANTGYTTYTGGGA; P22 (SEQ ID NO:17):
CATACCAGCCAAGCTTGTCIGGCCARTGCAT) were designed based on a
comparison of PTC amino acid sequences from fly (Drosophila melanogaster) (SEQ
ID NO:6), mosquito (Anopheles gambiae) (SEQ ID NO:8), butterfly (Precis
coenia) (SEQ ID NO:4), and beetle (Tribolium castaneum) (SEQ ID NO:2). I
represents inosine, which can form base pairs with all four nucleotides. P22 was
used to reverse transcribe RNA from 12.5 dpc mouse limb bud (gift from David
Kingsley) for 90 min at 37 °C. PCR using P4REV (SEQ ID NO:16) and P22 (SEQ
ID NO:17) was then performed on 1 μl of the resultant cDNA under the following
conditions:

94 °C 4 min.; 72 °C Add Taq;
[94 °C 15 sec.; 50 °C 30 sec.; 72 °C 90 sec.] 35 times
72 °C 10 min.; 4 °C hold

PCR products of the expected size were subcloned into the TA vector (Invitrogen)
and sequenced with the Sequenase Version 2.0 DNA Sequencing Kit (U.S.B.).
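For planning purposes, the cycling program above can be written down as data and its programmed hold time summed. This is only an illustrative sketch: the class and variable names are hypothetical, the untimed "Add Taq" pause and the final 4 °C hold are left out, and ramping between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


# Timed steps as listed in the text for the mouse RT-PCR.
INITIAL = [Step(94, 4 * 60)]
CYCLE = [Step(94, 15), Step(50, 30), Step(72, 90)]
N_CYCLES = 35
FINAL = [Step(72, 10 * 60)]


def total_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times; ramp times are not modeled."""
    per_cycle = sum(step.seconds for step in cycle)
    fixed = sum(step.seconds for step in initial) + sum(step.seconds for step in final)
    return fixed + n_cycles * per_cycle


hold_time = total_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"programmed hold time: {hold_time / 60:.2f} min (ramping not included)")
```

The same structure accommodates the mosquito program by adding a second cycle block for the three initial low-temperature cycles.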

Using the cloned mouse PCR fragment as a probe, 300,000 plaques of a
mouse 8.5 dpc λgt10 cDNA library (a gift from Brigid Hogan) were screened at
65 °C as above and washed in 2x SSC, 0.1% SDS at room temperature. 7 clones
were isolated, and three (M2, M4, and M8) were subcloned into pBluescript II.
200,000 plaques of this library were rescreened using first a 1.1 kb EcoRI fragment
from M2 to identify 6 clones (M9-M16) and secondly a mixed probe containing the
most N terminal (XhoI fragment from M2) and most C terminal sequences
(BamHI/BglII fragment from M9) to isolate 5 clones (M17-M21). M9, M10, M14,
and M17-21 were subcloned into the EcoRI site of pBluescript II (Stratagene).

V. RNA Blots and in situ Hybridizations in Whole and Sectioned Mouse Embryos

Northerns:

A mouse embryonic Northern blot and an adult multiple tissue Northern blot
(obtained from Clontech) were probed with a 900 bp EcoRI fragment from an N
terminal coding region of mouse ptc. Hybridization was performed at 65 °C in 5x
SSPE, 10x Denhardt's, 100 μg/ml sonicated salmon sperm DNA, and 2% SDS.
After several short room temperature washes in 2x SSC, 0.05% SDS, the blots were
washed at high stringency in 0.1x SSC, 0.1% SDS at 50 °C.
In situ hybridization of sections:

7.75, 8.5, 11.5, and 13.5 dpc mouse embryos were dissected in PBS and
frozen in Tissue-Tek medium at -80 °C. 12-16 μm frozen sections were cut,
collected onto VectaBond (Vector Laboratories) coated slides, and dried for 30-60
minutes at room temperature. After a 10 minute fixation in 4% paraformaldehyde in
PBS, the slides were washed 3 times for 3 minutes in PBS, acetylated for 10 minutes
in 0.25% acetic anhydride in triethanolamine, and washed three more times for 5
minutes in PBS. Prehybridization (50% formamide, 5x SSC, 250 μg/ml yeast
tRNA, 500 ng/ml sonicated salmon sperm DNA, and 5x Denhardt's) was carried
out for 6 hours at room temperature in 50% formamide/5x SSC humidified
chambers. The probe, which consisted of 1 kb from the N-terminus of ptc, was
added at a concentration of 200-1000 ng/ml into the same solution used for
prehybridization, and then denatured for five minutes at 80 °C. Approximately 75
μl of probe were added to each slide and covered with Parafilm. The slides were
incubated overnight at 65 °C in the same humidified chamber used previously. The
following day, the probe was washed successively in 5x SSC (5 minutes, 65 °C),
0.2x SSC (1 hour, 65 °C), and 0.2x SSC (10 minutes, room temperature). After
five minutes in buffer B1 (0.1 M maleic acid, 0.15 M NaCl, pH 7.5), the slides were
blocked for 1 hour at room temperature in 1% blocking reagent
(Boehringer-Mannheim) in buffer B1, and then incubated for 4 hours in buffer B1
containing the DIG-AP conjugated antibody (Boehringer-Mannheim) at a 1:5000
dilution. Excess antibody was removed during two 15 minute washes in buffer B1,
followed by five minutes in buffer B3 (100 mM Tris, 100 mM NaCl, 5 mM MgCl2,
pH 9.5). The antibody was detected by adding an alkaline phosphatase substrate
(350 μl 75 mg/ml X-phosphate in DMF, 450 μl 50 mg/ml NBT in 70% DMF in
100 mls of buffer B3) and allowing the reaction to proceed overnight in the dark.
After a brief rinse in 10 mM Tris, 1 mM EDTA, pH 8.0, the slides were mounted
with Aquamount (Lerner Laboratories).
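The working concentrations of the two chromogenic substrates follow from the stock volumes quoted above. The short calculation below is a sketch under the assumption that the stock volumes are simply added to the 100 ml of buffer B3; the function and variable names are hypothetical.

```python
def final_mg_per_ml(stock_mg_per_ml: float, stock_ul: float, total_ml: float) -> float:
    """Concentration after diluting `stock_ul` of stock into `total_ml` total volume."""
    return stock_mg_per_ml * (stock_ul / 1000.0) / total_ml


buffer_ml = 100.0
x_phosphate_ul = 350.0   # 75 mg/ml stock in DMF
nbt_ul = 450.0           # 50 mg/ml stock in 70% DMF

# Assumption: final volume is buffer plus both added stock volumes.
total_ml = buffer_ml + (x_phosphate_ul + nbt_ul) / 1000.0

x_final = final_mg_per_ml(75.0, x_phosphate_ul, total_ml)
nbt_final = final_mg_per_ml(50.0, nbt_ul, total_ml)
print(f"X-phosphate: {x_final:.3f} mg/ml, NBT: {nbt_final:.3f} mg/ml")
```

Under that assumption both substrates end up at roughly a quarter of a milligram per millilitre.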

VI. Drosophila 5'-transcriptional initiation region β-gal constructs

A series of constructs were designed that link different regions of the ptc
promoter from Drosophila to a LacZ reporter gene in order to study the cis
regulation of the ptc expression pattern. See Fig. 1. A 10.8 kb BamHI/BspM1
fragment comprising the 5'-non-coding region of the mRNA at its 3'-terminus was
obtained and truncated by restriction enzyme digestion as shown in Fig. 1. These

expression cassettes were introduced into Drosophila lines using a P-element vector
(Thummel et al., Gene 74, 445-456 (1988)), which were injected into embryos,
providing flies which could be grown to produce embryos. (See Spradling and
Rubin, Science (1982) 218, 341-347 for a description of the procedure.) The vector
used a pUC8 background into which was introduced the white gene to provide for
yellow eyes, portions of the P-element for integration, and the constructs were
inserted into a polylinker upstream from the LacZ gene. The resulting embryos
were stained using antibodies to LacZ protein conjugated to HRP and the embryos
developed with OPD dye to identify the expression of the LacZ gene. The staining
pattern is described in Fig. 1, indicating whether there was staining during the early
and late development of the embryo.

VII. Isolation of a Mouse ptc Gene
Homologues of fly PTC (SEQ ID NO:6) were isolated from three insects:
mosquito, butterfly and beetle, using either PCR or low stringency library screens.
PCR primers to six amino acid stretches of PTC of low mutability and degeneracy
were designed. One primer pair, P2 and P4, amplified a homologous fragment of
ptc from mosquito genomic DNA that corresponded to the first hydrophilic loop of
the protein. The 345 bp PCR product (SEQ ID NO:7) was subcloned and sequenced
and, when aligned to fly PTC, showed 67% amino acid identity.

The cloned mosquito fragment was used to screen a butterfly λgt10 cDNA
library. Of 100,000 plaques screened, five overlapping clones were isolated and
used to obtain the full length coding sequence. The butterfly PTC homologue (SEQ
ID NO:4) is 1,311 amino acids long and overall has 50% amino acid identity (72%
similarity) to fly PTC. With the exception of a divergent C-terminus, this homology
is evenly spread across the coding sequence. The mosquito PCR clone (SEQ ID
NO:7) and a corresponding fragment of butterfly cDNA were used to screen a beetle
λgem11 genomic library. Of the plaques screened, 14 clones were identified. A
fragment of one clone (T8), which hybridized with the original probes, was
subcloned and sequenced. This 3 kb piece contains an 89 amino acid exon (SEQ ID
NO:2) which is 44% and 51% identical to the corresponding regions of fly and
butterfly PTC respectively.
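Percent identity figures of this kind are computed from a pairwise alignment. The specification does not state which program or denominator was used, so the following is only a sketch of one common convention (matches divided by ungapped aligned columns); the function name and the two toy sequences are hypothetical and are not PTC sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned pair for illustration only.
toy_a = "MKV-LSTFCVGLKS"
toy_b = "MRVALSTF-VGIKS"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the shorter sequence, which gives somewhat different percentages for the same alignment.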

Using an alignment of the four insect homologues in the first hydrophilic loop
of PTC, two PCR primers were designed to a five and a six amino acid stretch
which were identical and of low degeneracy. These primers were used to isolate the
mouse homologue using RT-PCR on embryonic limb bud RNA. An appropriately
sized band was amplified and, upon cloning and sequencing, it was found to encode a
protein 65% identical to fly PTC. Using the cloned PCR product and subsequently
fragments of mouse ptc cDNA, a mouse embryonic λ cDNA library was screened.
From about 300,000 plaques, 17 clones were identified and of these, 7 form
overlapping cDNAs which comprise most of the protein-coding sequence (SEQ ID
NO:9).



VIII.

[...] 8 kb message. Further exposure does not reveal any additional minor bands.
Developmentally, ptc mRNA is present in [...] abundant by 11 and 15 dpc. While the
gene is still present at 17 dpc, the Northern blot indicates a clear decrease in the
amount of message at this stage. [...] ptc RNA is present in high amounts in the brain
and lung, as well as in moderate amounts in the kidney and liver. Weak signals are
detected in heart, spleen, skeletal muscle, and testes.

[...] no detectable signal in sections from 7.75 dpc embryos. [...] explained by the
low level of transcription. [...] along the neural axis of 8.5 dpc embryos. By 11.5
dpc, ptc can be detected in the developing lung buds and gut, consistent with its adult
Northern profile. In addition, the gene is present at high levels in the ventricular zone
of the central nervous system, as well as in the zona limitans of the prosencephalon.
ptc is also strongly transcribed in the condensing cartilage of 11.5 and 13.5 dpc limb
buds, as well as in the ventral portion of the somites, a region which is prospective
sclerotome and eventually forms bone in the vertebral column. ptc is present in a
wide range of tissues from endodermal, mesodermal and ectodermal origin,
supporting its fundamental role in embryonic development.

Isolation of the Human ptc Gene

[...] were screened with a 1 kbp mouse ptc fragment [...]. Filters were hybridized
overnight at reduced stringency (60 °C in 5x SSC, 10% dextran sulfate, 5x
Denhardt's, 0.2 mg/ml sonicated salmon sperm DNA, and 0.5% SDS). [...] (H1 and
H2) were isolated, the inserts cloned into [...] mouse ptc homolog. [...] in duplicate
with M2-3 EcoRI and M2-3 XhoI (containing [...] untranslated sequence of mouse
ptc) probes. Ten plaques were purified and of these, 6 inserts were subcloned into
pBluescript. To obtain the full coding sequence, H2 was fully and H14, H20, and
H21 were partially sequenced. The 5.1 kbp of human ptc sequence (SEQ ID NO:18)
contains an open reading frame of 1447 amino acids (SEQ ID NO:19) that is 96%
identical and 98% similar to mouse ptc. The 5' and 3' untranslated sequences of
human ptc (SEQ ID NO:18) are also highly similar to mouse ptc (SEQ ID NO:9),
suggesting conserved regulatory sequence.
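A quick arithmetic check shows how much of the roughly 5.1 kbp human sequence is accounted for by the 1447 amino acid open reading frame. This is a back-of-the-envelope sketch: the 5100 bp figure is the rounded value from the text, and counting one stop codon is an assumption.

```python
orf_codons = 1447      # amino acids in the human open reading frame (from the text)
cdna_bp = 5100         # "5.1 kbp", rounded as in the text

# Assumption: the coding region is the ORF plus a single stop codon.
coding_bp = (orf_codons + 1) * 3
noncoding_bp = cdna_bp - coding_bp

print(f"coding: {coding_bp} bp, 5' plus 3' untranslated: about {noncoding_bp} bp")
```

On these rounded numbers, about three quarters of a kilobase is left for the combined 5' and 3' untranslated sequence.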

IX. Comparison of Mouse, Human, Fly and Butterfly

The deduced mouse PTC protein sequence (SEQ ID NO:10) has about 38%
identical amino acids to fly PTC over about 1,200 amino acids. This amount of
conservation is dispersed through much of the protein excepting the C-terminal
region. The mouse protein also has a 50 amino acid insert relative to the fly
protein. Based on the sequence conservation of PTC and the functional conservation
of hedgehog between fly and mouse, one concludes that ptc functions similarly in
the two organisms. A comparison of the amino acid sequences of mouse (mptc)
(SEQ ID NO:10), human (hptc) (SEQ ID NO:19), butterfly (bptc) (SEQ ID NO:4)
and Drosophila (ptc) (SEQ ID NO:6) is shown in Table 1.

TABLE 1

Alignment of human, mouse, fly, and butterfly PTC homologs

Jfr w ^W^I GALGRQAGGGRRRRTCGPHRA-APDRDYLHRPSYCDA 

PTC M DRDSLPRVPDTHGD—WDE KLFSDl. yi-RTSWVDA 

BPTC MVAPDSEAPSNPRITAAHESPCATEA- RHSADL YI-wISJSa 

AFALEQISKGKATGRKMLWXJWKKFQRLLFKLGCYIQKMCGKFLW 
ArM£QISKGKATGl«APLWLW\KFQRLLFKLGCYIQKNCGKrLWGLLIF^ 






AIAl^EMK^IEGGRTSLWIRAWLQEQLFILGCFI^GDAGKVLFVMLVLSTFCVGLKS 
^^IJIJIf^^'^^^^^^^^T^QKIGEEAMFNPQI^ 

AQIHTRAa)QLWVQEGGRI£AEUCyTAQALGEADSSTHQLVrQTAKDPDVSLU!PGALLE^ 
LDSAI^RVHVYmhmCVnCMHLCYKSGELITET-GYMDQIIEYLYPCLIITPLDCrTO



GSKLL-GPDYPIYVPHLKHKLQWTHLNPLEWEEVK-KL KFQFPLSTIt^i ^ 

* * * . * * ** ..* •• • 








FNLGSILLVFPflMISLDLRRRSAARADLLCCLM-P ESP 

** 

LSCQSP 
LSCQSP 

-AKTRKNDKTHRlD-TTRQPLDPnVS 
• • • * • • 

• • • 

EKHYAPFLLKPKAKVWI FLFLGLLG 

ENVTK-1 ^ ★ ♦ 

VSLYCTTRVWJGUJLTDIVPRETREYDFIAAQIWre 

TSVWGATK\na»GIJ)LTDIVPEOTDBHEFWRQEKYrGFYNmAViw 

..♦...****. *•** * *• . • •



YHDQnmiPNIIKNDNGGLTKFWLSLFRDWLLDLQVAFDKEVASGCITQEYWCKNASDEG 
* * * *♦ _ ♦ ^ ^ ^ ^ 

• ♦ • . . ,***, ***** *** **^<*-* ^** 

NIRPHRPEWVHDKADYMPETRLRIPAAEPIEYAQFPFYLNGLRDTSDFVEAIEKVRTICS 

mvo^n«23^S!^P* DLKIPKSLPLVYAQMPFYLHGLTDTSQIKTLIGHIRDLSV 

MLKPQPQRWIHSPEDV HLEIKKSSPLIYTQLPFYLSGLSDTDSIKTLIRSVRDLCL 

•• • * . * * . *. *.* **** *• ♦» ^ • 

^r^J:™~!^*^^^"'"^^^^^"*"*'^^""^"^CTFLVCAVrLLNPWTAGIIVMV 
KYEGFGLPNYPSGIPFIFWEQVMTLRSSLAMIIACVLLAALVLVSLLLLSVWAAVLVILS 

CTEAKGLPNFPSGIPFLFWEQYLYLRTSLLlJUACWLGAVriAVMVLLLNAWAAVLVTLA 

• • **•••*•* •♦.****♦. _ ^ , 

lAU(4T\^LFGMMGLIGIiaSAVPWILlA^ 

LAIWTVELFGMMGLIGIKI^AVPVVILIASVGIGVEFTVHVALAFL^^ 

™,^^^°^^^^'*^^'^^'^^^^^^^V®®<I'CFNVLISLGFMrSVGNRQRRVQLSM 

IATLVLQLLGVMALLGVKLSflMPPVLLVLAIGRGVHFTVHLCI.GFVTSIGCKRRRASLAL 
••-* * *.*.**•*.* ***. ,.* . * * . . .* ... 

^^i^f^^fy^^^^'^^°^'^^°^^"'"VRYFFAVIJaLTVL6VIJ*GLVLLPVLLSFFG 
2«»?L«^°^^^*^^™^^^^^^^«"*'«^I'WLCVGACNSLLVFPILLSMVG 

ESVIAPVVHGAIJ«UUJVASMLAASEFGFVARLFLRLLIJU.VFLGLIDGLLFFPIVLSILG 

• ••*...* •♦.*•*.*♦ « , , .*..♦* * 

»^«^!^'^*^""*^"^^^^^^^'5^™P*'<5«THSGSDSSDSEYSSQTTVSGLSE-EL 
f^r^yf *^^^^^^^^^*'^^^^AVPPGHTNNGSDSSDSEYSSQTTVSGISE-EL 

PAAEVRPIEHPERLSTPSPKCSPIHPRKSSSSSGGGDKSSRTS—KSAPRPC APSL 

• • * • .♦..♦*** * . . • « « 

ni!3^^~®°^'^°^^'^^*'^*'*"STWHPESRHHPPSNPRQQPHLDSGSLPPGRQ 
22^Sf^°2^?S3^*"^^^'f^**PVFARSTVVHPDSRHQPPLTPRQQPHLDSGSLSPGRQ 

TTITEEPSSifHSSAHSVQSSMQSIWQPEVWETTTYNGSDSASGRSTPTKSSHGGAITT 






^B^»f^''^**'^"'^^"*^^^S''*=^»SGPSNRARWGPRGARSHNPRNPASTflMG 
^So^r °"***^^*'*^'*'^^^^^'^^'^"^°PSNRDRSGPRGARSHNPRNPTSTAMG 
TPPPPFPTA YPPELQSIWQPEVTVETTHS DS 

TKVTATANIKVEWTPSDRKSRRSYHYYDRRRDRDEDRDRDRERDRDRDRDRDRDRDRDR 






°""^^'^^'^^^^^P<5PSRNPRGGLCPGY PETDHGLFEDPHVP 

SSVPSYCQPITTVTASASVTVAVHPP— PGPGRNPRGGPCPGYESYPETDHGVFEDPHVP 

NT TKVTATAMIKVELAMP GRAVRS YNFTS 

DR DRERSRERDRRDRYRD ERDHRA SPRENGRDSGHE 



FHVRCERRDSKVEVIELQDVECEERPRGSSSN 
FHVRCERRDSKVEVIELQDVECEERPWGSSSN



-SDSSRH

The identity of ten other clones recovered from the mouse library is not
determined. These cDNAs cross-hybridize with mouse ptc sequence, while differing
as to their restriction maps. These genes encode a family of proteins related to the
patched protein. Alignment of the human and mouse nucleotide sequences, which
includes coding and noncoding sequence, reveals 89% identity.



In accordance with the subject invention, mammalian patched genes, including
the mouse and human genes, are provided which allow for high level production of
the patched protein, which can serve many purposes. The patched protein may be
used in screening for agonists and antagonists, for isolation of its ligand,
particularly hedgehog, more particularly Sonic hedgehog, and for assaying for the
transcription of the ptc mRNA. The protein or fragments thereof may be used to
produce antibodies specific for the protein or specific epitopes of the protein. In
addition, the gene may be employed for investigating embryonic development, by
screening fetal tissue, preparing transgenic animals to serve as models, and the like.

All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be readily
apparent to those of ordinary skill in the art in light of the teachings of this invention
that certain changes and modifications may be made thereto without departing from
the spirit or scope of the appended claims.






SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR 
UNIVERSITY 

(ii) TITLE OF INVENTION: Patched Genes and their Use 
(iii) NUMBER OF SEQUENCES: 19

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Flehr, Hohbach, Test, Albritton & Herbert 

(B) STREET: Four Embarcadero Center, Suite 3400 

(C) CITY: San Francisco 

(D) STATE: CA 

(E) COUNTRY: US 

(F) ZIP: 94111 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: PCT/US95/

(B) FILING DATE: 06-OCT-1995

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Rowland, Bertram I 

(B) REGISTRATION NUMBER: 20015 

(C) REFERENCE/DOCKET NUMBER: a60190-l 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 415-781-1989 

(B) TELEFAX: 415-398-3249 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 736 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AACNNCNNTN NATGCCACCC CCNCCCAACC TTTNNNCCNN NTAANCAAAA NNCCCCNTTT 
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NATACUUt-u 1 


n X AAII AI>1 * 4 * 


TCCACCNNNC 


NNAAANNCCN 


CTGNANACNA 


NGNAAANCCN 


120 


TTTTTWAACU 


V« ^ W W w^«>A\« Vi* W 


GGAATTCCNA 


NTNNCCNCCC 


CCAAATTACA ACTCCAGNCC 


180 


AAAATTNANA 




TAACCTAACC 


NATNGTTGTT 


ACGGTTTCCC 


CCCCCAAATA 


240 


CATG C ACTGC 


r»r«rv2 a A f* ACT 


TCATCGTTGC 


CGTTCCAATA 


AGAATAAATC 


TGGTCATATT 


300 


aaacaagccm 


aA AfifTTTAC 
iUuw^X XXilti« 


AAACTGTTGT 


ACAATTAATG 


GGCGAACACG 


AACTGTTCXJA 


360 


ATTCTGO A w X 


<^Ar»ATTACA 


AAGTGCACCA 


CATCGGATGG 


AACCAGGAGA 


AGGCCACAAC 


420 


CGTACTGAAu 




A6AAGTTCGC 


ACAGGTTCGT 


GGTTGGCGCA 


AGGAGTAGAG 


480 




TAATTTTTGG 


TTCTTCCAGG 


AGGTGGATCG 


TCTGACGAAG 


AGCAAGAAGT 


540 




CATCTTCGTG 


ACCTTCTCCA 


CCGCCAATTT 


GAACAAGATG 


TTGAAGGAGG 


600 


CGTCGAANAC 


GGACGTGGTG 


AAGCTGGGGG 


Tuu I vv« X w 


GGTGGCGGCG 


GTGTACGGGT 


660 


GGGTGGCCCA 


GTCGGCGCTG 


GCTGCCTTGG 


GAGTGCTGGT 


CTTNGCGNGC 


TNCNATTCGC 


720 


CCTATAGTNA 


GNCGTA 










736 


(2) INFORMATION FOR SEQ ID NO: 2: 











(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Xaa Pro Pro Pro Asn Tyr Asn Ser Xaa Pro Lys Xaa Xaa Xaa Leu Val
1 5 10 15

Leu Thr Pro Xaa Val Val Thr Val Ser Pro Pro Lys Tyr Met His Trp
20 25 30

Pro Glu His Leu Ile Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile
35 40 45

Leu Asn Lys Pro Lys Ala Leu Gln Thr Val Val Gln Leu Met Gly Glu
50 55 60

His Glu Leu Phe Glu Phe Trp Ser Gly His Tyr Lys Val His His Ile
65 70 75 80

Gly Trp Asn Gln Glu Lys Ala Thr Thr Val Leu Asn Ala Trp Gln Lys
85 90 95

Lys Phe Ala Gln Val Gly Gly Trp Arg Lys Glu
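For comparison with one-letter alignments such as Table 1, a three-letter residue listing like the one above can be converted mechanically. The sketch below uses the standard amino acid codes, with Xaa mapped to X; the function name is a hypothetical illustration, and the example row is the first row of the SEQ ID NO:2 listing.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",  # unknown or unspecified residue
}


def to_one_letter(listing_row: str) -> str:
    """Convert a whitespace-separated three-letter residue row to one-letter code."""
    return "".join(THREE_TO_ONE[token] for token in listing_row.split())


first_row = "Xaa Pro Pro Pro Asn Tyr Asn Ser Xaa Pro Lys Xaa Xaa Xaa Leu Val"
print(to_one_letter(first_row))
```

A token that is not a valid three-letter code raises a `KeyError`, which is a convenient way to flag residues damaged in transcription.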
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(2) INFORMATION FOR SBQ ID N :3j 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GGGTCTCTCA CCCGOACCCG GAGTCCCCGC CGCCCAGCA6 
CCAGGCGCGC CCGGAGCCCG CGGCGGCGGC GGCAACATCG 
GCGGCCCTGG GCAGGCAGCC CGGCGGCGGC AGGCGCAGAC 
CCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT 
GAGCAGATTT CCAAGGGGAA GGCTACTGCC CCGAAAGCGC 
TTTCAGAGAC TCTTATTTAA ACTCCGTTGT TACATTCAAA 
CTTGTGGGTC TCCTCATATT TCGGGCCTTC GCTGTGGGAT 
ACCAACGTCG AGCAGCTGTC GGTGCAAGTT GGTGGACGAG 
ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC 
AAAGAAGAAG GCGCTAATCT TCTCACCACA GAGCCTCTCC 
CTCCACGCCA GTCGTGTGCA CGTCTACATG TATAACAG6C 
TGCTACAAAT CACGCGAACT TATCACG6AG ACAGCTTACA 
CTTTACCCTT GCTTAATCAT TACACCTTTG CACTCCTTCT 
TCCGGGACAO CATACCTCCT AGGTAAGCCT CCTTTACGGT 
GAATTCCTAG AACAGTTAAA GAAAATAAAC TACCAAGTGG 
AATAAAGCCG AAOTTGCCCA TGCOTACATO GACCGGCCTT 
GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC 
TTGAATGGTG GATCTCAAG6 TTTATCCAGG AAOTATATGC 
GTCCGTGGTA CCOTCAACAA TCCCACTGGA AAACTTCTCA 
ATGTTCCACT TAATCACTCC CAACCAAATG TATGAACACT 
TCTCACATCA ACTGCAATCA AOACAGCGCA GCCGCCATCC 



CGTCCTCGOG 
CCTCCGCTGG 
GGACCG6GGG 
GCGACGCCGC 
CCCTGTGGCT 
AGAACTGCGG 
TAAAGCCAGC 
TGAGTCCACA 
AACTCATGAT 
TGCAACACCT 
AATCGAACTT 
TGGATCAGAT 
GGGAAGGGGC 
GGACAAACTT 
ACAGCTGGGA 
GCCTCAACCC 
CTCTTGATGT 
ATTGGCACCA 
GCGCTCACCC 
TCAGGGGCTA 
TCCACGCCTG 



AGCCGAGCGC 
TAACGCCGCC 
ACCGCACCGC 
CTTCGCTCTG 
GAGAGCGAAG 
CAAGTTTTTG 
TAATCTCGAG 
ATTAAATTAT 
ACAGACTCCA 
GGACTCAGCA 
GGAACy^TTTG 
AATAGAATAC 
AAAGCTACAC 
TCACCCCTTC 
CCAAATGCTG 
AGCCGACCCA 
GGCCCTTGTT 
GGAGTTGATT 
CCTCCAAACC 
CCACTATGTC 
GCAGAGGACT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
X080 
1140 
1200 
1260 
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TACGTGGAGG 
ACAACCACGA 
GCCAGCGGCT 
TCCAAGTCCC 
GCAGGATTGG 
TTGCCGTTTC 
AGTGAAACAG 
CGCACCGGAG 
GCATTGATCC 
TTCAATTTTG 
CGTGAGGACA 
ATTCAAGTTG 
CCCCCATACA 
CAGCTCCGCA 
TCTGAGATCT 
GAGAGCACCA 
CTCGAGCCCC 
TTCCTCCTGA 
GTCAGCCTTT 
CGGGAAACCA 
ATGTATATAG 
CATAAGAGTT 
ATGTGGCTGC 
TGGGAAACTG 
GCTTACAAAC 
ACTAAACAGC 
CTGACCGCTT 
CCTCACCGGC 
ATCCCAGCAG 



TCGTTCATCA 
CCCTGGACGA 
ACCTACTGAT 
AGGGTGCCGT 
GCCTCTGCTC 
TTGCTCTTGG 
GACAGAATAA 
CCAGCGTGGC 
CTATCCCTGC 
CTATGGTTCT 
GAAGATTGGA 
ACCCACAGGC 
CCAGCCACAG 
CAGAGTATGA 
CTGTACAGCC 
GCTCTACCAG 
CCTGCACCAA 
AACCCAAAGC 
ATGGGACCAC 
GAGAATATGA 
TCACCCAGAA 
TCAGCAATCT 
ACTACTTTAG 
GGAGGATCAT 
TCCTGCTGCA 
GTCTGGTAGA 
GGGTCAGCAA 
CCGAGTCCGT 
CAGAGCCCAT 



AAGTGTCGCC 
CATCCTAAAA 
GCTTGCCTAT 
GGGGCTGCCT 
CTTGATTGGC 
TGTTGGTGTG 
GAGGATTCCA 
CCTCACCTCC 
CCTGCGAGCG 
GCTCATTTTT 
TATTTTCTGC 
CTACACAGAG 
CTTCGCCCAC 
CCCTCACACG 
TGTTACCGTC 
GGACCTGCTC 
GTGGACACTC 
CAAGGTTGTG 
CCGAGTGAGA 
CTTCATAGCT 
A6CAGACTAC 
GAAGTATGTC 
AGACTGGCTT 
GCCAAACAAT 
GACT6GCAGC 
CGCAGATGGC 
CX5ACCCTGTA 
CCATGACAAA 
CGAGTACX3CT 



CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 
TCCTTCTCTG ATGTCACTGT CATCCGAGTG 
GCCTGTTTAA CCATGCTGCG CTGGGACTGC 
GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 
ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 
GATGATGTCT TCCTCCTGGC CCATGCATTC 
TTTGAGGACA GGACTGGGGA GTGCCTCAAG 
ATCAGCAATG TCACCGCCTT CTTCATGGCC 
TTCTCCCTCC AGGCTGCTGT CGTGGTGGTA 
CCTGCAATTC TCAGCATGGA TTTATACAGA 
TGTTTCACAA GCCCCTGTGT CAGCACGGTG 
CCTCACAGTA ACACCCGGTA CAGCCCCCCA 
GAAACCCATA TCACTATGCA GTCCACCGTT 
CACGTGTACT ACACCACCGC CGAGCCACGC 
ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 
TCCCAGTTCT CAGACTCCAG CCTCCACTGC 
TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 
GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 
GACGGGCTGG ACCTCACGGA CATTGTTCCC 
GCCCAGTTCA AGTACTTCTC TTTCTACAAC 
CCGAATATCC AGCACCTACT TTACGACCTT 
ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 
CAAGGACTTC AGGATGCATT TGACAGTGAC 
TATAAAAATG GATCAGATGA CGGGGTCCTC 
CGAGACAAGC CCATCCACAT TAGTCACTTG 
ATCATTAATC CGAGCGCTTT CTACATCTAC 
GCTTACGCTG CCTCCCAGGC CAACATCCGG 
GCCGACTACA TGCCAGAGAC CAGGCTGACA 
CAGTTCCCrr TCTACCTCAA CGGCCTACGA 



1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 
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GACACCTCAG ACTTTGTOGA 
AGCCTGGGAC TGTCCAGCTA 
AGCCTGCGCC ACTGGCTGCT 
TGCGCAGTCT TCCTCCTGAA 
ATGACCGTTG AGCTCTTTGG 
GTGGTCATCC TCATTCCATC 
GCCTTTCTOA CAGCCATTGG 
TTTCCTCCCC TTCTGGACGG 
TCCGAATTTG ATTTCATTGT 
GGGGTTCTCA ATGGACTGGT 
GAGGTGTCTC CAGCCAATGG 
AGTGTCGTCC GCTTTGCCGT 
TCGGAGTACA GCTCTCACAC 
GCACAGCAGG GTGCCGGAGG 
GTCTTTGCCC GGTCCACTGT 
CGGCAACAGC CCCACCTGCA 
CCAAGGGATC CCCCTACAGA 
TTTCAAATTT CTACTGAAGG 
GGGGCCCGTT CTCACAACCC 
AGCTACTGCC AGCCCATCAC 
CCCCCCCCTG GACCTGGGCG 
CCTGAGACTG ATCACGGCGT 
AGGAGGGAC7 CAAAGGTGGA 
TGGGGGAGCA GCTCCAACTG 
AAGCCCCGCC CCCACCTCTT 
GGCACTTCAT TGTTACTGTA 
AARAGGTGTA CACATCTAAT 
CCACTCCTGC CCCAGAGTGG 
TGTGCCACAA CCAAOCTTAA 
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AOCCATAGAA AAAGTGAGAO TCATCTCTAA CAACTATACO 3060 

CCCCAATGGC TACCCCTTCC TCTTCTCGGA GCAATACATC 3120 

GCTATCCATC AGCGTCGTCC TCGCCTGCAC GTTTCTAGTG 3180 

CCCCTCGACG GCCCGCATCA TTGTCATCCT CCTCGCTCTG 3240 

CATCATCGCC CTCATTGGGA TCAACCTCAG TGCTGTGCCT 3300 

TGTTGCCATC CGACTCGACT TCACCGTCCA CGTGGCTTTG 3360 

CGACAAOAAC CACAGGGCTA TGCTCGCTCT GCAACACATG 3420 

TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTCCAGGG 3480 

CAGATACTTC TTTCCCGTCC TGGCCATTCT CACCCTCTTG 3540 

TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600 

CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCCCCTCCA 3660 

GCCTCCTCCT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720 

CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACCAA 3780 

CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840 

GGTCCATCCG GACTCCACAC ATCAGCCTCC CTTGACCCCT 3900 

CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCACCCT 3960 

ACCCTTCCGG CCACCCCCCT ACACACCGCG CAGAGACX3CT 4020 

OCATTCTCGC CCTAGCAATA GGGACCGCTC AGGCCCCCGT 4080 

TCGGAACCCA ACGTCCACCG CCATGGGCAC CTCTGTGCCC 4140 

CACTGTGACO GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200 

CAACCCCCCA GCCGGGCCCT GTCCACGCTA TGAGAGCTAC 4260 

ATTTCACCAT CCTCATCTGC CTTTTCATGT CACGTGTGAG 4320 

GGTCATAGAG CTACAGOACG TGGAATGTGA GGAGAGGCCG 4380 

AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAA6ATTGGA 4440 

TCCACAACTG CTTGAAGAGA ACTGCTTCCA ATTATGGGAA 4500 

ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560 

ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620 

GGAOACCACA GGGGCCCTTT CCCCTGTCTA CATTGGTCTC 4680 

CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740 
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CTTAAATATT GTATAATTTA CTTOTATAAT TCTATCCAAA 


TATTGCTTAT 


GTAATAGGAT 


4800 


TATTTGTAAA GGTTTCTCTT TAAAATATTT TAAATTTGCA TATCACAACC 


CTGTGGTAGG 


4860 


ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG 


TAATTGTTTA 


ACGAGCAGAC 


4920 


ATGAAGAAAA CACGTTAATC CCAGTGCCTT CTCTAGGCGT 


AGTTGTATAT 


GGTTCGCATC 


4980 


GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA 


TTGTGGTTTG 


TTGTTGTTGT 


5040 


TGCTGTTGTT GTTCATTTTG GTGTTTTrGG TTGCTTTCTA 


TGATCTTAGC 


TCTGGCCTAG 




GTGGGCTGGG AAGGTCCAGC TCTTTTTCTG TCGTGATGCT 


GGTGGAAAGG 


TGACCCCAAT 


5160 








5187 


CATCTGTCCT ATTCTCTGGG ACTATTC 








(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1311 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Val Ala Pro Asp Ser Glu Ala Pro Ser Asn Pro Arg Ile Thr Ala
1 5 10 15

Ala His Glu Ser Pro Cys Ala Thr Glu Ala Arg His Ser Ala Asp Leu
20 25 30

Tyr Ile Arg Thr Ser Trp Val Asp Ala Ala Leu Ala Leu Ser Glu Leu
35 40 45

Glu Lys Gly Asn Ile Glu Gly Gly Arg Thr Ser Leu Trp Ile Arg Ala
50 55 60

Trp Leu Gln Glu Gln Leu Phe Ile Leu Gly Cys Phe Leu Gln Gly Asp
65 70 75 80

Ala Gly Lys Val Leu Phe Val Ala Ile Leu Val Leu Ser Thr Phe Cys
85 90 95

Val Gly Leu Lys Ser Ala Gln Ile His Thr Arg Val Asp Gln Leu Trp
100 105 110

Val Gln Glu Gly Gly Arg Leu Glu Ala Glu Leu Lys Tyr Thr Ala Gln
115 120 125

Ala Leu Gly Glu Ala Asp Ser Ser Thr His Gln Leu Val Ile Gln Thr
130 135 140
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Ala Lys A«p Pro Asp Val Ser Lou Leu Hi- . 

145 ^ <51y Ala Leu Leu Glu 

160 

Hia Leu tys Val Val His Al. Ma Thr Arg Val xhr Val His Met Tyr 

170 275 

».p n. «u T^^ ^. 

ll's «- «.r ". n. ».p ».„ V.1 n. 

205 

pre cy. Ala II. He Th, Pro Leu Asp cy. Phe Trp Clu Cly Ser .ys 

2" 220 
Leu Leu Oly Pro A.p Xyr Pro He Tyr Val Pro Hi. Leu Ly. His Lys 

240 

Leu cm Trp Thr Hi. Leu A.„ Pro L.u Clu Val Val Clu Clu Val Lys 

250 255 
I-y. L.U Ly. oin Ph. Pr, ^ 5.r Thr n. M, K.t Ly. 

265 270 
Arg Ala Cly xi. Thr Ser Ala Tyr Met Ly. Ly. Pro Cy. Leu A.p Pro 

285 

Thr A.p Pro Hi. cy. Pro Ala Thr Ala Pro A.n Ly. Ly. Ser Cly Hi. 

300 

lie Pro A.P val Ala Ala Clu Leu Ser Hi. Cly Cy. Tyr Cly Phe Ala 

315 

Ala Ala Tyr Met Hi. Trp Pro Clu Cln Leu He Val Cly Cly Ala Thr 

330 

Arg A.„ ser Thr Ser Ala l.u Arg Ly. Ala Arg Xaa Leu Cln Thr Val 

-^^^ 350 

v.i oi„ „, ^au^ry, 

365 

ry^ jy. v.x hu „x. „. „ ^ ^„ 

380 

t^u A.P Ala Trp Cln Arg Ly. Phe Ala Ala Clu Val Arg Ly. He Thr 

Thr s.. Cl, s.. V.1 s.. Al. ryr s,r Ph. Tyr Pro Ph. s.r Thr 

410 

"r Thr ^ j.„ MP 11. ^ , ^ 



415 

Phe Ser Glu Val Ser 
«S 430 

A.n II lie Leu cly Tyr Met Ph Met Leu lie Tyr Val Ala Val Thr 



440 



445 



l-u II Cln Trp Arg A.p Pro lie Arg Ser Cln Ala Gly Val Cly He 
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450 455 «0 

Ala Gly Val Leu Leu Leu Ser He Thr Val Ala Ala Gly Leu Gly Phe 



46S 



470 



cya Ala Leu Leu Gly He Pro Phe Asn Ala Ser Ser Thr Gin lie Val 
485 

pro Phe Leu Ala Leu Gly Leu Gly Val Gin Asp Met Phe Leu Leu Thr 
500 505 

Hi8 Thr Tyr Val Glu Gin Ala Gly Asp Val Pro Arg Glu Glu Arg Thr 

515 520 
Gly Leu val Leu Lys Lys Ser Gly Leu Ser Val Leu Leu Ala Ser Leu 

530 535 
Cys Asn val Met Ala Phe Leu Ala Ala Ala Leu Leu Pro He Pro Ala 
545 550 555 

^ ^ T &1a ala He Leu Leu Leu Phe Aon Leu 

Phe Arg Val Phe cya Leu Gin Ala Ala lie i-eu 
565 570 

Oly ser He Leu Leu Val Phe Pro Ala Met He Ser Leu Asp Leu Arg 
580 

Arg ser Ala Ala Arg Ala Asp Leu Leu Cys Cys Leu Met Pro Glu 



595 



600 



Lys Lys LyB He Pro Glu Arg Ala Lys Thr Arg Lys 



615 



620 



Ser Pro Leu Pro 
610 

Asn ASP Lys Thr His Arg He Asp Thr Thr Arg Gin Pro Leu Asp Pro 



625 



630 



ASP val ser Glu Asn Val Thr Lys Thr cys Cys Leu Ser Val Ser Leu 
645 ^50 

Thr Lys Trp Ala Lys Aen Gin Tyr Ala Pro Phe He Met Arg Pro Ala 
660 

val Lys val Thr Ser Met Leu Ala Leu He Ala Val He Leu Thr Ser 

675 

val Trp Gly Ala Thr Lya Val Lys Asp Gly Leu Asp Leu Thr Asp He 
690 

val pro Glu Asn Thr Asp Glu His Glu Phe Leu Ser Arg Gin Glu Lys 
705 7" 

Tyr Phe Gly Phe Tyr Asn Met Tyr Ala Val Thr Gin Gly Asn Phe Glu 
725 '30 

Tyr pro Thr Asn Gin Lys Leu Leu Tyr Glu Tyr His Asp Gin Phe Val 
740 ''*5 

Arg n pro Asn He He Lys Aen A«p Asn Gly Gly Leu Thr Lys Phe 



•ifin 765 
755 760 
Trp L»M Sar Leu Phe Arg Asp Trp Leu Leu Asp Leu Gin Val Ala Phe 
"0 775 780 

Asp Lys Glu Val Ala Ser Gly Cys Ile Thr Gln Glu Tyr Trp Cys Lys
785 790 795 800

Asn Ala Ser Asp Glu Gly Ile Leu Ala Tyr Lys Leu Met Val Gln Thr
810 815

Gly His Val Asp Asn Pro Ile Asp Lys Ser Leu Ile Thr Ala Gly His
820 825 830

Arg Leu Val Asp Lys Asp Gly Ile Ile Asn Pro Lys Ala Phe Tyr Asn
835 840 845

Tyr Leu Ser Ala Trp Ala Thr Asn Asp Ala Leu Ala Tyr Gly Ala Ser
850 855 860

Gln Gly Asn Leu Lys Pro Gln Pro Gln Arg Trp Ile His Ser Pro Glu
870 875 880

Asp Val His Leu Glu Ile Lys Lys Ser Ser Pro Leu Ile Tyr Thr Gln
885 890 895

Leu Pro Phe Tyr Leu Ser Gly Leu Ser Asp Thr Xaa Ser Ile Lys Thr
900 905 910

Leu Ile Arg Ser Val Arg Asp Leu Cys Leu Lys Tyr Glu Ala Lys Gly
915 920 925

Leu Pro Asn Phe Pro Ser Gly Ile Pro Phe Leu Phe Trp Glu Gln Tyr
930 935 940

Leu Tyr Leu Arg Thr Ser Leu Leu Leu Ala Leu Ala Cys Ala Leu Ala
950 955 960

Ala Val Phe Ile Ala Val Met Val Leu Leu Leu Asn Ala Trp Ala Ala
965 970 975

Val Leu Val Thr Leu Ala Leu Ala Thr Leu Val Leu Gln Leu Leu Gly
980 985 990

Val Met Ala Leu Leu Gly Val Lys Leu Ser Ala Met Pro Ala Val Leu
995 1000 1005

Leu Val Leu Ala Ile Gly Arg Gly Val His Phe Thr Val His Leu Cys
1010 1015 1020

Leu Gly Phe Val Thr Ser Ile Gly Cys Lys Arg Arg Arg Ala Ser Leu
1030 1035 1040

Ala Leu Glu Ser Val Leu Ala Pro Val Val His Gly Ala Leu Ala Ala
1045 1050 1055

Ala Leu Ala Ala Ser Met Leu Ala Ala Ser Glu Cys Gly Phe Val Ala
1060 1065 1070

Arg Leu Phe Leu Arg Leu Leu Leu Asp Ile Val Phe Leu Gly Leu Ile
1075 1080 1085

Asp Gly Leu Leu Phe Phe Pro Ile Val Leu Ser Ile Leu Gly Pro Ala
1090 1095 1100

Ala Glu Val Arg Pro Ile Glu His Pro Glu Arg Leu Ser Thr Pro Ser
1105 1110 1115 1120

Pro Lys Cys Ser Pro Ile His Pro Arg Lys Ser Ser Ser Ser Ser Gly
1125 1130 1135

Gly Gly Asp Lys Ser Ser Arg Thr Ser Lys Ser Ala Pro Arg Pro Cys
1140 1145 1150

Ala Pro Ser Leu Thr Thr Ile Thr Glu Glu Pro Ser Ser Trp His Ser
1155 1160 1165

Ser Ala His Ser Val Gln Ser Ser Met Gln Ser Ile Val Val Gln Pro
1170 1175 1180

Glu Val Val Val Glu Thr Thr Thr Tyr Asn Gly Ser Asp Ser Ala Ser
1185 1190 1195 1200

Gly Arg Ser Thr Pro Thr Lys Ser Ser His Gly Gly Ala Ile Thr Thr
1205 1210 1215

Thr Lys Val Thr Ala Thr Ala Asn Ile Lys Val Glu Val Val Thr Pro
1220 1225 1230

Ser Asp Arg Lys Ser Arg Arg Ser Tyr His Tyr Tyr Asp Arg Arg Arg
1235 1240 1245

Asp Arg Asp Glu Asp Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg
1250 1255 1260

Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg
1265 1270 1275 1280

Glu Arg Ser Arg Glu Arg Asp Arg Arg Asp Arg Tyr Arg Asp Glu Arg
1285 1290 1295

Asp His Arg Ala Ser Pro Arg Glu Lys Arg Gln Arg Phe Trp Thr
1300 1305 1310

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4434 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
OCAAACAAOA OAOCGAOTOA GACTACOCAO ACCCTCTGTO TTCTGTOTTO ACTGTCGCCC 60
ACOCACACAO GCGCAAAACA OTGCACACAC ACOCCCGCTO CGCAACACAG AGTGAOAGAG 120
AGAAACAGCC GCCCGCGCTC GCCTAATGAA GTTCTTGCCC TGGCTGGCGT GCCCCATCCA 180
CGAGATACAG ATACATCTCT CATGOACCGC GACAGCCTCC CACGCCTTCC GGACACACAC 240
GCCGATGTGG TOGATCAGAA ATTATTCTCG GATCTTTACA TACGCACCAG CTGGGTGGAC 300
GCCCAAGTGC C6CTC0ATCA GATACATAAG CGCAAAGCGC GTGGCAGCCG CACCGOGATC 360
TATCTGCGAT CAOTATTCCA GTCCCACCTC GAAACCCTOG GCAGCTCCGT GCAAAAGCAC 420
GCCGGCAAGC TGCTATTCCT GCCTATCCTG GTGCXGAGCA CCTTCTGC6T CG6CCTGAAG 480
ACCGCCCAGA TCCACTCCAA GGTCCACCAC CTGTGGATCC AGGACCGCCG CCGGCTCGAG 540
GCGGAACTGC CCTACACACA GAAGACGATC CGCGAGGACG AGTCGGCCAC GCATCAGCTG 600
CTCATTCAGA CXJACCCACCA CCCCAAOOCC TCCGTCCTGC ATCCGCAGGC GCTCCTTGCC 660
CACCTGGAGG TCCTGGTCAA GCCCACCGCC CTCAACGTCC ACCTCTACOA CACCGAATGC 720
GGGCTGCGCG ACATGTGCAA CATOCCGAGC ACGCCCTCCT TCGAGCGCAT CTACTACATC 780
GAGCAGATCC TGCGCCACCT CATTCOGTCC TOGATCATCA CCCCGCTGGA CTCTTTCTCG 840
GAGGGAAOCC AGCTGTT«GC TCCGCAATCA 6CGGTCGTTA TACCAGGCCT CAACCAAOGA 900
CTCCTGTGGA CCACCCTCAA TCCOCCCTCT GTCATGCAGT ATATGAAACA AAACATGTCC 960
GAGGAAAAGA TCAGCTTOGA CTTCCAGACC GTGGAGCAGT ACATGAAGCG WCCGCCATT 1020
CGCAGTGGCT ACATGGACAA GCCCTGCCTG AACCCACTCA ATCCCAATTG CCCXK!ACACG 1080
GCACCGAACA AOAACAGCAC CCAOCCCCCG GATGTGGGAG CCATCC;rGTC CGGACGCTGC 1140
TACGGTTATG CCGCGAAOCA CATGCACTCG CCCGAGGAGC TCATTGTCGG CGGACGGAAG 1200
AGGAACOGCA GCGGACACTT 6AGGAAGGCC CAGGCCCTGC AGTCCGTGGT GCACCTGATG 1260
ACCGAGAAGG AAATCTAOCA CCAGIGGCAC GACAACTACA ACGTGCACCA TCTTCGATGG 1320
ACXSCAGGAGA AGGCAGCGGA GGTTTTCAAC GCCTGGCAGC GCAACTTTTC CCGGGAGGTG 1380
GAACAGCTGC TACGTAAACA GTCGAGAATT GCCACCAACT ACGATATCTA CGTGTTCAGC 1440
TCGGCTGCAC TCGATGAOVT CCTGGCCAAG TTCTCCCATC CCACCGOCTT CTCCATTGTC 1500
ATCCGCGTGG CCGTCACCGT TTTCTATGCC TTTTGCACGC TCCTCCCCTG GAGGGACCCC 1560
GTCCGTGGCC ACACCAOTGT CGGC6TGCCC 0CA6TTCTGC TCATGTGCTT CAGTACOGCC 1620
CCOGGATTCG GATTGTCAOC CCTGCTCCGT ATCOTTTTCA ATCOGCTCAC CCCTCCCTAT 1680
OCGGAGAGCA ATCOCCGCCA OCAOACCAAG CTCATTCTCA ACAAC6CCAG CACCCAGCTG 1740
GTTCCGTTTT TGCCCCTTGG TCTGGOCCTC GATCACATCT TCATAGTGGG ACCCAGCaVTC 1800
CTGTTCACTG CCTGCAGCAC CGCAGGATCC TTCTTTGCGG CCGCCTTTAT TCCGGTGCCG 1860
GCTTTGAAGG TATTCTGTCT GCAGGCTGCC ATCGTAATGT GCTCCAATTT GGCAGCGGCT 1920
CTATTGGTTT TTCOGGCCAT GATTTCGTTG GATCTATCGA GACGTACCGC CGGCAGGGCG 1980
GACATCTTCT GCTGCTGTTT TCCGGTGTGG AAGGAACAGC CGAAGGTGGC ACCTCCGGTG 2040
CTGCCGCTGA ACAACAACAA CGGGCGOGGG GCCCGGCATC CGAAGAGCTG CAACAACAAC 2100
ACGGTGCCGC TGCCCCCCCA CAATCCTCTG CTGGAACAGA GGGCAGACAT CCCTGGGAGC 2160
AGTCACTCAC TGGCGTCCTT CTCCCTGGCA ACCTTCGCCT TTCAGCACTA CACTCCCTTC 2220
CTCATGCGCA CCTGGGTCAA GTTCCTCACC CTTATGGGTT TCCTGGCGGC CCTCATATCC 2280
AGCTTCTATG CCTCCACCCG CCTTCAGGAT GGCCTGGACA TTATTGATCT GGTGCCCAAG 2340
GACAGCAACG AGCACAAGTT CCTGGATGCT CAAACTCGGC TCTTTGGCTT CTACAGCATG 2400
TATGCGGTTA CCCAGGGCAA CTTTGAATAT CCCACCCAGC AGCAGTTGCT CAGGGACTAC 2460
CATGATTCCT TTGTGCGGGT GCCACATGTG ATCAAGAATG ATAACGGTGG ACTGCCGGAC 2520
TTCTGGCTGC TGCTCTTCAC CGAGTGGCTG GGTAATCTGC AAAAGATATT CGACGAGGAA 2580
TACCGCGACG GACG6CT6AC CAAGGAGTGC TCGTTCCCAA ACGCCAGCAG CGATGCCATC 2640
CTGGCCTACA AGCTAATCGT GCAAACCGGC CATGTGGACA ACCCCGTGGA CAAGGAACTG 2700
CTCCTCACCA ATCGCCTGGT CAACAGCGAT GGCATCATCA ACCAACGCGC CTTCTACAAC 2760
TATCTGTCGG CATGCGCCAC CAACGACCTC TTCGCCTACG GAGCTTCTCA CGCCAAATTG 2820
TATCCGGAAC CGCGCCAGTA TTTTCACCAA CCCAACGAGT ACGATCTTAA GATACCCAAG 2880
AGTCTGCCAT TGGTCTACGC TCAGATGCCC rrrTACCTCc ACGGACTAAC AGATACCTCG 2940
CAGATCAAGA CCCTCATAGG TCATArrCGC GACCTGAGCG TCAAGTACGA GGGCTTCCGC 3000
CTGCCCAACT ATCCATCGGG CATTCCCTTC ATCTTCTGGG AGCAGTACAT GACCCTGCGC 3060
TCCTCACTGG CCATGATCCT GGCCTGCGTG CTACTCGCCG CCCTGGTGCT GGTCTCCCTG 3120
CTCCTGCTCT CCX3TTTGGGC CGCCGTTCTC GTGATCCTCA GCGTTCTGGC CTCGCTGGCC 3180
CAGATCTTTG GGGCCATGAC TCTGCTGGGC ATCAAACTCT CGCCCATTCC GCCAGTCATA 3240
CTCATCCTCA GOGTGGGCAT GATGCXGTGC TTCAATGTGC TGATATCACT CCGCTTCATG 3300
ACATCCGTTG GGAACCGACA GCGCCGCCTC CAGCTGAGCA TGCAGATGTC CCTGGGACCA 3360
CTTGTCCACG GCATGCTGAC CTCCGGAGTG GCCGTGTTCA TGCTCTCCAC CTCGCCCTTT 3420
GAGTTTGTCA TCCGGCACTT CTGCTGGCTT CTGCTGGTGG TCTTATGCGT TGGCGCCTGC 3480
AACAGCCTTT TGGTGTTCCC CATCCTACTG AGCATGGTGG GACCGGAGGC GGAGCTGGTG 3540
CCGCTGGAGC ATCCAGACCC CATATCCACG CCCTCTCCGC TGCCCCTGCG CAGCAGCAAG 3600
AGATCGGGCA AATCCTATGT GGTGCAGGGA TCGCGATCCT CGCGAGGCAG CTGCCAGAAG 3660
TCGCATCACC ACCACCACAA AGACCTTAAT GATCCATCCC TGACGACGAT CACCGAGGAG 3720
CCXSCAGTCXJT GGAAGTCCAG CAACTCCTCC ATCCAGATGC CCAATGATTG GACCTACCAG 3780
CCGCX^GGAAC A6CGACCCGC CTCCTACCCG CCCCCGCCCC CCCCCTATCA CAAGGCCGCC 3840
GCCCAGCAGC ACCACCAGCA TCAGGGCCCG CCCACAACGC CCCCGCCTCC CTTCCCGACG 3900
GCCTATCCGC CGGAGCTCCA GAGCATCGTG GTGCAGCCCG AGGTGACGGT GGAGACGACG 3960
CACTCGGACA GCAACACCAC CAAGGTGACG GCCACGGCCA ACATCAAGGT GGAGCTGGCC 4020
ATCCCCCGCA GGCCGGTGCG CAGCTATAAC TTTACGACTT AGCACTAGCA CTAGTTCCTG 4080
TAGCTATTAG GACGTATCTT TAGACTCTAG CCTAAGCCGT AACCCTATTT GTATCTGTAA 4140
AATCGATTTG TCCAGCGGGT CTGCTGAGGA TTTCGTTCTC ATGGATTCTC ATGGATTCTC 4200
ATGGATGCTT AAATGGCATG GTAATTGGCA AAATATCAAT TTTTGTGTCT CAAAAAGATG 4260
CATTAGCTTA TGGTTTCAAG ATACATTTTT AAAGAGTCCG CCAGATATTT ATATAAAAAA 4320
AATCCAAAAT CGACGTATCC ATGAAAATTG AAAAGCTAAG CAGACCCGTA TGTATGTATA 4380
TGTGTATGCA TGTTAGTTAA TTTCCCGAAG TCCGGTATTT ATAGCAGCTG CCTT 4434

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1285 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Asp Arg Asp Ser Leu Pro Arg Val Pro Asp Thr His Gly Asp Val 
^5 10 15 

val Asp Glu Lys Leu Phe Ser Asp Leu Tyr He Arg Thr Ser Trp Val 
20 25 30 

Asp Ala Gin Val Ala Leu Asp Gin He Asp Lys Gly Lys Ala Arg Gly 



40 



45 



Ser Arg Thr Ala He Tyr Leu Arg Ser Val Phe Gin Ser Hie Leu Glu 

55 60 
Thr Leu Gly Ser Ser Val Gln Lys His Ala Gly Lys Val Leu Phe Val

65 70 75

Ma lie Leu val Leu Ser Thr Phe Cys Val Gly Leu Lye Ser Ala Gin 



85 90 



lie Hi. ser Lye Val Hie Gin Leu Trp He Gin Glu Gly Gly Arg Leu 
100 

Clu Ala Glu Leu Ala Tyr Thr Gin Lye The He Gly Glu Asp Glu Ser 



115 



120 



Ala Thr His Gin Leu Leu II. Gin Thr Thr His Asp Pro Asn Ala Ser 



130 

His Pro Gin 

150 



135 



val Leu His Pro Gin Ala Leu Leu Ala His Leu Clu Val Leu Val Lys 



Ma Thr Ala Val Lys Val His Leu Tyr Asp Thr Glu Trp Gly I.u Arg 
165 

»BP H.. »o se. Thr pro sar Ph. OU oly Jl. Tyr Tyr 

180 

IXe Glu Gin lie Leu Arg His Leu He Pro Cys Ser lie He Thr Pro 

195 200 
Leu ASP cys Phe Trp Glu Gly Ser Gin Leu Leu Gly Pro Glu Ser Ala 



210 



215 



val val He Pro Gly Leu Asn Gin Arg Leu Leu Trp Thr Thr Leu Asn 
225 

P.O Ala ser Val Met Oln Tyr Met Lys Gin Lys Met Ser Glu Glu Lys 



245 



lie ser Phe Asp Phe Glu Thr Val Clu Gin Tyr Met Lys Arg Ala Ala 



260 



265 



lie Gly ser Gly Tyr Met Glu Lys Pro Cys Leu Asn Pro Leu Asn Pro 



275 



;.sn cys Pro Asp Thr Ala Pro Asn Lys Asn ser Thr Gin Pro Pro Asp 



290 



295 



val Gly Ala He Leu Ser Gly Gly Cys Tyr Gly Tyr Ala Ala Lys His 
305 3" 

Met His Trp pro Glu Glu Leu He Val Gly Gly Arg Lys Arg Asn Arg 



325 



ser Gly His Leu Arg Lys Ala Gin Ala Leu Gin Ser Val Val Gin Leu 



340 



345 



Met Thr Glu Lys Glu Met Tyr Asp Gin Trp Gin Asp Asn Tyr Lys Val 

355 

His His Leu ly Trp Thr Gin Glu Lys Ala Ala Clu Val Leu Asn Ala 
370 375

III ^9 P*** III Val Glu Gin Leu Leu Arg Lye Gin 

395 

Ser Arg lie Ala Thr Asn Tyr Asp He Tyr Val Phe Ser ser Ala Ala 



405 410 



415 



Leu Asp Asp lie Leu Ala Ly. Phe Ser Hie Pro Ser Ala Leu Ser He 

*25 430 



Val He Gly Val Ala Val Thr Val Leu Tyr Ala Phe Cya Thr Leu 



435 440 



Leu 



445 



Arg Arg ABp Pro Val Arg Gly cin Ser Ser Val Gly Val Ala Gly 

455 4gQ 

val Leu Leu Met Cys Phe Ser Thr Ala Ala Gly Leu Gly Leu Ser Ala 

475 43Q 

Leu Leu Gly H. yal Phe Asn Ala Leu Thr Ala Ala Tyr Ala Glu Ser 



485 490 



495 



Asn Arg Arg Glu Gin Thr Lys Leu He Leu Lys Asn Ala Ser Thr Gin 
500 505 510 

Val val Pro Phe Leu Ala Leu Gly Leu Gly Val Asp His He Phe He 

520 

val Gly Pro Ser He Leu Phe Ser Ala Cys Ser Thr Ala Gly ser Phe 

535 540 

Phe Ala Ala Ala Phe He Pro Val Pro Ala Leu Lys Val Phe Cys Leu 

"° 555 

cm Ala Ala He Val Met Cy. Ser Asn Leu Ala Ala Ala Leu Leu Val 

570 575 

Phe Pro Ala Met He Ser Leu Asp Leu Arg Arg Arg Thr Ala Gly Arg 
580 585 

Ala Asp He Phe Cys Cys Cys Phe Pro Val Trp Lys Glu Gin Pro Lys 

600 505 

val Ala Pro Pro Val Leu Pro Leu Asn Asn Asn Asn Gly Arg Gly Ala 

«15 620 

Arg His Pro Lys Ser Cys Asn Asn Asn Arg Val Pro Leu Pro Ala Gin 

"° 635 640 

Asn Pro Leu Leu Glu Gin Arg Ala Asp He Pro Gly Ser Ser His Ser 



650 655 



Leu Ala Ser Phe Ser Leu Ala Thr Phe Ala Phe Gin His Tyr Thr 



665 670 



Pro 



Ph Leu Met Arg Ser Trp Val Ly. Phe Leu Thr Val Met Gly Phe Leu 
Ti« sar ser Leu Tyr Ala Ser Thr Arg Leu Gin Asp Gly 
Ala Ala Leu He Ser ser ^eu y 

690 

X.. »P - v.. pro .V. »P - -0 

705 

^ „p «. «^ «, «u P.» 01, .5. Tvr se. K.. T.. v.. 

725 

oh« Glu Tvr pro Thr Gin Gin Gin Leu Leu Arg Asp 
Thr Gin Gly Asn Phe Glu Tyr era 
740 

o Ph« Arc Val pro His Val He Lye Asn Asp Asn Gly 
Tyr His Asp Ser Phe Arg vax r 
755 

„ Phe Trp Leu Leu Leu Phe Ser Glu Trp Leu Gly Asn 

Gly Leu pro Asp Phe Trp 
770 

o:» W. n. p». «J "» ^' ^ 

785 

- - - 55 "* - 

805 



^„ X.. v.. o.n o»;ku v.. «p x.„ pro v.. ..P .v; - 

820 ^''^ 

. „_ Tie lie Asn Gin Arg 

val Leu Thr Asn Arg Leu Val Asn Ser Asp Gly 

835 

T Ala Trp Ala Thr Asn Asp Val Phe Ala 

Ala Phe Tyr Asn Tyr Leu Ser Ala Trp 

850 

s.r «, .V. - ^r pro Pro «, .vr PJ. 

865 

c.„ pro «n «1» T,r L.u U. Pro S" - - ^ 



885 

0I» K« pro P- T,r Lju - - "J ™' 

900 

, . «i« lie Arg Asp Leu Ser Val Lys Tyr 
Gin lie Lye Thr Leu He Gly HU lie Arg P 



915 

o.. 01, p.. - - - ^' Jio "° "* 

930 

OX» «.« Tvr H.t T,. ser .er ^« M. «.t Ue «u «. 

945 

, • trai <;<»r Leu Leu Leu Leu Ser 
cys val Leu Leu Ala Ala Leu Val Leu Val Ser Leu 
^ 965 ^' 

v.. Trp v.. v.. .X. ^ ser V.X ^ - x-« «. 

980 ^^'^ 
995 1000 1005

Pro Ala Val Ile Leu Ile Leu Ser Val Gly Met Met Leu Cys Phe Asn
1010 1015 1020

Val Leu Ile Ser Leu Gly Phe Met Thr Ser Val Gly Asn Arg Gln Arg
1025 1030 1035 1040

Arg Val Gln Leu Ser Met Gln Met Ser Leu Gly Pro Leu Val His Gly
1045 1050 1055

Met Leu Thr Ser Gly Val Ala Val Phe Met Leu Ser Thr Ser Pro Phe
1060 1065 1070

Glu Phe Val Ile Arg His Phe Cys Trp Leu Leu Leu Val Val Leu Cys
1075 1080 1085

Val Gly Ala Cys Asn Ser Leu Leu Val Phe Pro Ile Leu Leu Ser Met
1090 1095 1100

Val Gly Pro Glu Ala Glu Leu Val Pro Leu Glu His Pro Asp Arg Ile
1105 1110 1115 1120

Ser Thr Pro Ser Pro Leu Pro Val Arg Ser Ser Lys Arg Ser Gly Lys
1125 1130 1135

Ser Tyr Val Val Gln Gly Ser Arg Ser Ser Arg Gly Ser Cys Gln Lys
1140 1145 1150

Ser His His His His His Lys Asp Leu Asn Asp Pro Ser Leu Thr Thr
1155 1160 1165

Ile Thr Glu Glu Pro Gln Ser Trp Lys Ser Ser Asn Ser Ser Ile Gln
1170 1175 1180

Met Pro Asn Asp Trp Thr Tyr Gln Pro Arg Glu Gln Arg Pro Ala Ser
1190 1195 1200

Tyr Ala Ala Pro Pro Pro Ala Tyr His Lys Ala Ala Ala Gln Gln His
1205 1210 1215

His Gln His Gln Gly Pro Pro Thr Thr Pro Pro Pro Pro Phe Pro Thr
1220 1225 1230

Ala Tyr Pro Pro Glu Leu Gln Ser Ile Val Val Gln Pro Glu Val Thr
1235 1240 1245

Val Glu Thr Thr His Ser Asp Ser Asn Thr Thr Lys Val Thr Ala Thr
1250 1255 1260

Ala Asn Ile Lys Val Glu Leu Ala Met Pro Gly Arg Ala Val Arg Ser
1265 1270 1275 1280

Tyr Asn Phe Thr Ser
1285

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 345 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



AAGGTCCATC AGCTTTGGAT ACAGGAAGGT GGTTCGCTCG AGCATGAGCT AGCCTACACG 60
CAGAAATCGC TCGGCGAGAT GGACTCCTCC ACGCACCAGC TGCTAATCCA AACNCCCAAA 120
CATATGGACG CCTCGATACT GCACCCGAAC GCGCTACTGA CGCACCTCGA CGTGGTGAAG 180
AAAGCGATCT CGGTGACGGT GCACATGTAC GACATCACGT GGAGNCTCAA GGACATGTGC 240
TACTCGCCCA GCATACCGAG NTTCGATACG CACTTTATCG AGCAGATCTT CGAGAACATC 300
ATACCGTGCG CGATCATCAC GCCGCTGGAT TGCTTTTGGG AGGGA 345
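As a cross-check on the listing, the nucleotide sequence of SEQ ID NO: 7 can be translated and compared with the 115-residue peptide of SEQ ID NO: 8. The following is only an illustrative sketch and not part of the filed listing: it assumes reading frame 1 and the standard genetic code, and it reports any codon containing the ambiguity base N as X (Xaa). The function name `translate` is hypothetical.

```python
# Illustrative sketch: translate a nucleotide listing in frame 1
# using the standard genetic code (assumed, not stated in the listing).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Return one-letter residues; codons with non-ACGT characters become 'X'."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# First 30 bases of SEQ ID NO: 7 as printed in the listing.
print(translate("AAGGTCCATC AGCTTTGGAT ACAGGAAGGT"))  # KVHQLWIQEG
```

The ten residues printed correspond to Lys Val His Gln Leu Trp Ile Gln Glu Gly, the opening residues of SEQ ID NO: 8.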



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Ser Leu Glu His Glu

Leu Ala Tyr Thr Gln Lys Ser Leu Gly Glu Met Asp Ser Ser Thr His
20 25 30

Gln Leu Leu Ile Gln Thr Pro Lys Asp Met Asp Ala Ser Ile Leu His
35 40 45

Pro Asn Ala Leu Leu Thr His Leu Asp Val Val Lys Lys Ala Ile Ser
50 55

Val Thr Val His Met Tyr Asp Ile Thr Trp Xaa Leu Lys Asp Met Cys
65 70 75

Tyr Ser Pro Ser Ile Pro Xaa Phe Asp Thr His Phe Ile Glu Gln Ile
85

Phe Glu Asn Ile Ile Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe
105 110

Trp Glu Gly
115

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGGTCTGTCA CCCGCAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCCCG AGCCGAGCGC 60
CCAGGCGCGC CCGGAGCCCG CGCCGGCGCC GGCAACATGC CCTCGGCTGG TAACGCCGCC 120
CCGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180
GCCGCGCCGG ACCGGCACTA TCTGCACCGC CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240
GAGCAGATTT CCAACGCGAA GCCTACTCGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300
TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360
GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGCCAGC TAATCTCGAG 420
ACCAACGTGC AGGAGCTGTC GGTGGAAGTT GGTGCACGAG TGAGTCGAGA ATTAAATTAT 480
ACCCGTCAGA AGATAGGAGA AGAGCCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540
AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600
CTCCAGGCCA GTCGTGTCCA OGTCTACATC TATAACAGGC AATGCAAGTT GGAACATTTG 660
TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720
CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780
TCCGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGCT GGACAAACTT TGACCCCTTG 840
GAATTCCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900
AATAAAGCCG AAGTTCGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC ACCCCACCCA 960
GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTCTT 1020
TTGAATGGTG CATGTCAACG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTCATT 1080
GTGGGTCGTA CCGTCAAGAA TCCCACTGGA AAACTTGTCA GCGCTCACGC CCTCCAAACC 1140






ATGTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA CGACTATGTC 1200
TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260
TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320
ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380
GCCAGCGCCT ACCTACTGAT CCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440
TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500
GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560
TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620
AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680
CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740
GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800
TTCAAXTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860
CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920
ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980
CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040
CAGCTCCCCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100
TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160
GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220
CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAACCA CTATGCTCCT 2280
TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340
GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400
CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460
ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520
CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG ACAACAAGCA ACTTCCCCAA 2580
ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640
TGGGAAACTG GGACGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700
GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760
ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820
CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880
CCTCACCGGC CGGAGTCCCT CCATCACAAA CCC8ACTACA TGCCAGACAC CACGCTGAOA 2940 

ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGCCCTACGA 3000 

GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTCAGAG TCATCTGTAA CAACTATACG 3060 

AGCCTGCCAC TCTCCACCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120 

ACCCTGCGCC ACTGCCTGCT CCTATCCATC AOCCTGGTGC TGGCCTOCAC CTTTCTAGTO 3180 

TGCCCAGTCT TCCTCCTCAA CCCCTGCACO GCCGGCATCA TTGTCATGGT CCTGCCTCTG 3240 

ATGACCGTTC AGCTCTTTGC CATGATOGGC CTCATTGGCA TCAAGCTGAG TGCTGTGCCT 3300 

GTGGTCATCC TCATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTCGCTTTG 3360 

GCCTTTCTGA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420 

TTTCCTCCCC TTCTGOACGG TGCTGTGTCC ACTCTCCTGO GTGTACTGAT GCTTCC»GGC 3480 

TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCOTCC TGGCCATTCT CACCGTCTTG 3540 

GCGGTTCTCA ATGGACT6CT TCTGCTGCCT GTCCTCTTAT CCTTCTTTOG ACCCTGTCCT 3600 

CAGGTGTCTC CAGCCAATGC CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660 

AGTCTCGTCC GGTTTGCCOT OCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720 

TCGGAGTACA GCTCTCACAC CACCCTGTCT GCCATCAOTG AGGAGCTCAG CCAATACCAA 3780 

GCACAGCAGG 6TGCC0CA6G CCCTCCCCaVC CAACTGATTG TGOAAGCCAC AGAAAACCCT 3840 

CTCTTTGCCC GGTCCACTCT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTCACCCCT 3900 

CGCCAACAGC CCCACCTGGA CTCTGCCTCC TTGTCCCCTG GACCGCAAGG CCAGCACCCT 3960 

CGAAGGGATC CCCCTA0A6A AOCCTTOCGG CCACCCCCCT ACACACCCCG CAOAGAOGCT 4020 

TTTGAAATTT CTACTOAAGC GCaTTCTOGC CCTACCAATA GGGACCGCTC AGGGCCCCGT 4080 

GGGGCCCGTT CTCACAACCC TCGCAACCCA ACGTCC»CCG CCATGGGCAG CTCTGTGCCC 4140 

AGCTACTGCC AGCCCATCAC CACTOTGACXS CCTTCTGCTT CGGTGACTCT TCCT6TGCAT 4200 

CCCCCGCCTG CACCTOOGCG CAACCCOOGA CCOCCGCCCT GTCCAGGCTA TCAGAGCTAC 4260 

CCTGAGACTO ATCACGCGGT ATTTCAGGAT CCTCaVTGTGC CTTTTCATCT CAGGTCTCAG 4320 

AGGAGCGACT CaAAGCTGGA OCTCATAGAG CTACAGGACXS TGGAATGTGA OGAGAGGCCG 4380 

TGGGGCAGCA GCTCCAACTG AGGGTAATTA AAATCT6AAG CAAAGAGGCC AAAGATTGGA 4440 

AAGCCCCGCC CCCACXTTCTT TCCACAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500 

GCCAGTTCAT TGTTACTCTA ACTGATTCTA TTATTKKGTC AAATATTTCT ATAAATATTT 4560 

AARA6GTGTA CACATCTAAT ATACAWGAA ATGCTGTACA CTCTArTTCC TGCGGCCTCT 4620 
CCACTCCTCC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680
TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT CTCGCTGCTG 4740
CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800
TATTTGTAAA GCTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860
ATGAATTGTT ACTGTTAACT TTTCAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920
ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980
GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040
TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTCCTTTGTA TGATCTTAGC TCTGGCCTAG 5100
GTG6GCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160
CATCTGTCCT ATTCTCTGGG ACTATTC 5187

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1434 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ala Ser Ala Gly Asn Ala Ala Gly Ala Leu Gly Arg Gln Ala Gly
1 5 10

Gly Gly Arg Arg Arg Arg Thr Gly Gly Pro His Arg Ala Ala Pro Asp
20 25

Arg Asp Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu
35 45

Glu Gln Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp
50 55

Leu Arg Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile
65 70 80

Gln Lys Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly
85

Ala Phe Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu
100

Glu Leu Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr
115 120
Thr Arg Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met
130 135 140

Ile Gln Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala
145 150 155 160

Leu Leu Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val
165 170 175

Tyr Met Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser
180 185 190

Gly Glu Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr
195 200 205

Leu Tyr Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
210 215 220

Ala Lys Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu
225 230 235 240

Arg Trp Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys
245 250 255

Ile Asn Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu
260 265 270

Val Gly His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro
275 280 285

Asp Cys Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp
290 295 300

Val Ala Leu Val Leu Asn Gly Gly Cys Gln Gly Leu Ser Arg Lys Tyr
305 310 315 320

Met His Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ala
325 330 335

Thr Gly Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu
340 345 350

Met Thr Pro Lys Gln Met Tyr Glu His Phe Arg Gly Tyr Asp Tyr Val
355 360 365

Ser His Ile Asn Trp Asn Glu Asp Arg Ala Ala Ala Ile Leu Glu Ala
370 375 380

Trp Gln Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Pro Asn
385 390 395 400

Ser Thr Gln Lys Val Leu Pro Phe Thr Thr Thr Thr Leu Asp Asp Ile
405 410 415

Leu Lys Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr
420 425 430

Leu Leu Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys
435 440 445



ser Lys Ser Gin Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala 
450 455 460 

Leu ser Val Ala Ala Gly Leu Gly Leu Cys Ser Leu lie Gly lie Ser 
465 470 475 480 

Phe ABti Ala Ala Thr Thr Gin Val Leu Pro Phe Leu Ala Leu Gly Val 
485 490 495 

Clv val Asp Aap val Phe Lou Leu Ala His Ala Phe Ser Glu Thr Oly 
500 505 510 

Gin Aan Lys Arg He Pro Phe Glu Aap Arg Thr Gly Glu Cya Leu Lys 
SIS 520 525 

Ara Thr Glv Ala Ser Val Ala Leu Thr Ser He Ser Asn Val Thr Ala 
530 535 540 

Phe Phe Met Ala Ala Leu lie Pro He Pro Ala Leu Arg Ala Phe Ser 
545 550 555 560 

Leu Gin Ala Ala Val Val Val Val Phe Aan Phe Ala Met Val Leu Leu 
565 570 575 

He Phe Pro Ala He Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg 
580 585 590 

Arg Leu Asp He Phe cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val 
595 600 605 

He Gin Val Glu Pro Oln Ala Tyr Thr Glu Pro His Ser Asn Thr Arg 
610 615 620 

Tvr Ser Pro Pro Pro Pro Tyr Thr Ser His Ser Phe Ala His Glu Thr 
625 630 635 640 

His He Thr Met Gin Ser Thr Val Gin Leu Arg Thr Glu Tyr Asp Pro 
645 650 655 

His Thr His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu He Ser 
660 665 670 

Val Gin Pro Val Thr Val Thr Gin Asp Asn Leu Ser Cys Gin Ser Pro 
675 680 685 

Glu ser Thr Ser Ser Thr Arg Aap Leu Leu Ser Gin Phe Ser Asp Ser 
690 695 700 

Ser Leu His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser 
705 710 715 720 

Phe Ala Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys 
725 730 735 

Val Val Val He Leu Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr 
740 745 750 
Gly Thr Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro
755 760 

Arg Glu Thr Arg Glu Tyr Asp Phe He Ala Ala Gin Phe Lys Tyr Phe 
770 775 780 

Sor Phe Tyr Asn Met Tyr He Val Thr Gin Lys Ala Asp Tyr Pro Asn 

'90 795 aoo 

He Gin His Leu Leu Tyr Asp Lau His Lys Ser Phe Ser Asn Val Lys 
805 810 815 

Tyr Val Met Leu Glu Glu Asn Lys Gin Leu Pro Gin Met Trp Leu His 
820 825 830 

Tyr Phe Arg Asp Trp Leu Gin Gly Leu Gin Asp Ala Phe Asp Ser Asd 
835 840 845 

Trp Glu Thr Gly Arg He Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp 
850 855 860 

Asp Gly val Leu Ala Tyr Lys Leu Leu Val Gin Thr Gly Ser Arg Asp 
^" 8"'0 875 880 

Lys Pro He Asp He Ser Gin Leu Thr Lys Gin Arg Leu Val Asp Ala 
885 890 895 

Asp Gly He He Asn Pro Ser Ala Phe Tyr He Tyr Leu Thr Ala Trp 
900 905 910 

Val Ser Asn Asp Pro Val Ala Tyr Ala Ala Ser Gin Ala Asn He Aro 
915 920 

Pro His Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu 
930 935 

Thr Arg Leu Arg He Pro Ala Ala Glu Pro He Glu Tyr Ala Gin Phe 

950 955 9g(j 

Pro Phe Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala 
965 970 

He Glu Lys Val Arg Val He Cys Asn Asn Tyr Thr Ser Leu Gly Leu 
980 985 990 

Ser Ser Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gin Tyr He 
995 1000 1005 

ser Leu Arg His Trp Leu Lau Leu Ser He Ser Val Val Leu Ala cvs 
1010 1015 1020 

Thr Phe Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Glv 

1030 1035 loio 

He He Val Met Val Lou Ala Leu M t Thr Val Glu Leu Phe Gly Met 
1045 1050 1055 

Met Gly Leu He Gly He Lys Leu Ser Ala Val Pro Val Val He Leu 
1060 1065 1070



lie Ala Ser Val Cly He Gly Val Glu Phe Thr Val Hie Val Ala Leu 
1075 1080 1085 

Ala Phe Leu Thr Ala He Gly Asp Lye Asn His Arg Ala Met Leu Ala 
1090 1095 1100 

Leu Glu Hia Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu 
1105 1110 ^^^^ 

Leu Gly val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe He Val Arg 
' 1125 1130 1135 



Tyr Phe Phe Ala Val Leu Ala He Leu Thr Val Leu Gly Val Leu Aen 
1140 11*5 1150 

Gly Leu val Leu Leu Pro val Leu Leu Ser Phe Phe Gly Pro Cys Pro 
' 1155 1160 1165 

Glu val ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro 
1170 1175 1180 

Glu pro Pro pro Ser Val Val Arg Phe Ala Val Pro Pro Gly His Thr 



1185 



1190 1195 1200 



Asn Asn Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gin Thr Thr 
1205 1210 121^ 

val ser Gly lie Ser Glu Glu Leu Arg Gin Tyr Glu Ala Gl" ^ly 
1220 1225 1230 

Ala Gly Gly Pro Ala His Gin Val He Val Glu Ala Thr Glu Asn Pro 
1235 1240 1245 

val Phe Ala Arg Ser Thr Val Val His Pro Asp Ser Arg His Gin Pro 
1250 1255 1260 



Pro Leu Thr Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Ser 
1265 1270 1275 1280 

Pro Gly Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly 
1285 1290 

Leu Arg Pro Pro Pro Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser 
1300 1305 1310 

Thr Glu Gly His Ser Gly Pro Ser Asn Arg Asp Arg Ser Gly Pro Arg 
1315 1320 1325 

Gly Ala Arg Ser His Asn Pro Arg Asn Pro Thr Ser Thr Ala Met Gly 
1330 1335 1340 

Ser Ser Val Pro Ser Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser 
1345 1350 1355 

Ala Ser Val Thr Val Ala Val His Pro Pro Pro Gly Pro Gly 
1365 1370 1375 

Pro Arg Gly Gly Pro Cys Pro Gly Tyr Glu Ser Tyr Pro Glu Thr Asp 
1380 1385 1390 

His Gly Val Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu 
1400 1405 



mo"^^^ ^^"^ ""'^ xiJs''*^ cyfi 



1420 



Glu Glu Arg Pro Trp Gly Ser Ser Ser Asn 
1425 1430 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Leu Ile Val Gly Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Pro Phe Phe Trp Glu Gln Tyr 
1 5 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGACGAATTC AARGTNCAYC ARYTNTGC 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GGACGAATTC CYTCCCARAA RCANTC 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGACGAATTC yTKCAKTCrT TYTGGGA 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CATACCAGCC AAGCTTGTCN GGCCARTGCA T 
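
The primers of SEQ ID NOS: 14 to 17 are written with IUPAC nucleotide ambiguity codes (for example R = A or G, Y = C or T, N = any base), so each entry stands for a pool of related oligonucleotides rather than a single molecule. The following is a minimal illustrative sketch, not part of the original listing, of how the length and the size of that pool could be checked for the two primers that are legible as printed. The function name `degeneracy` and the variable names are hypothetical and chosen only for this example.

```python
from math import prod

# IUPAC nucleotide ambiguity codes mapped to the bases each symbol stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    seq = primer.replace(" ", "").upper()
    # Multiply the number of alternatives allowed at each position.
    return prod(len(IUPAC[base]) for base in seq)


# Primer sequences copied from the listing (spaces only group the bases in tens).
SEQ_ID_NO_14 = "GGACGAATTC AARGTNCAYC ARYTNTGC"
SEQ_ID_NO_17 = "CATACCAGCC AAGCTTGTCN GGCCARTGCA T"

for name, primer in (("SEQ ID NO: 14", SEQ_ID_NO_14), ("SEQ ID NO: 17", SEQ_ID_NO_17)):
    length = len(primer.replace(" ", ""))
    print(name, "length:", length, "degeneracy:", degeneracy(primer))
```

Under these assumptions the computed lengths (28 and 31 bases) agree with the LENGTH fields stated for SEQ ID NOS: 14 and 17, which gives a quick consistency check on the transcription of those two entries.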
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GAATTCCGGG GACCGCAAGG ACTGCCGCGG AAGCCCCCGA AGGACACGCT CGCTCGGCGC 60 

GCCGCCTCTC CCTCTTCCGC GAACTGGATG T6GGCA0CGC CGGCCGCAGA CACCTCGGGA 120 

CCCCCGCGCA ATGTGGCAAT GGAACGCGCA GCGTCT6ACT CCCCGGCAGC GGCCGCGGCC 180 

CCA6CGGCAG CAGCGCCCGC OGTGTGAGCA GCAGCACCGG CTGGTCTGTC AACCGGAGCC 240 

CGAGCCCGAG CAGCCTGCGG CCAGCAGCGT CCTCGCAAGC CGAGCGCCCA GGOGCGCCAG 300 

GAGCCCGCAG CAGCGGCAGC AGCGCGCCGG GCCGCCCGGG AAGCCTCCGT CCCOGCGGCG 360 

GCCCCGGCGG CGGCGGCGGC AACATGGCCT CCGCTGCTAA CGCCGCCGAG CCCCAGCACC 420 

GCGGCGGCGG CGGCAGCGGC TGTATCGGTC CCCCGGGACG GCCCGCTGGA GGCGGGAGGC 480 

GCAGACGGAC GGGGCCGCTG CGCOSTGCTG CCCCGCCGGA CCGGGACTAT CTGCACCGGC 540 

CCAGCTACTG CGACGCCGCC TTCGCTCTOC ACCAGATTTC CAAGGGGAAG GCTACTCCCC 600 

GGAAAGCGCC ACTGTCCCTG AGAGCGAAGT TTCAGAGACT CTTATTTAAA CTGGGTTGTT 660 

ACATTCAAAA AAACTGCGGC AAGTTCTTGG TTCTGCGCCT CCTCATATTT GGCGCCTTCG 720 

COCTOCGATT AAAAGCAGCO AACCTCGAGA CCAACGTGGA GGAGCTGTCO GTGGAAGTTG 780 

GAGGACCAGT AAGTCGTGAA TTAAATTATA CTCGCCAGAA GATTCGAGAA GAGGCTATGT 840 

TTAATCCTCA ACTCATGATA CAGACCCCTA AAGAAGAAGO TGCTAATGTC CTGACCACAG 900 

AAGCGCTCCT ACAACACCTO GACTCGOCAC TCCAGGCCAG CCGTGTCCAT GTATACATGT 960 

ACAACAGCCA GTGGAAATTG GAACATTTGT GTTACAAATC AGGAGAGCTT ATCACAGAAA 1020 

CAGCTTACAT GGATCAGATA ATAGAATATC TTTACCCTTG TTTGATTATT ACACCTTTGG 1080 

ACTGCTTCTC GGAAGGGGCG AAATTACAGT CTGGGACAGC ATACCTCCTA GGTAAACCTC 1140 

CTTTGCGGTG GACAAACTTC GACCCTTTGG AATTCCTGGA AGAGTTAAAG AAAATAAACT 1200 

ATCAAGTGGA CAGCTGGGAO GAAATGCTGA ATAAGGCTGA GGTTGGTCAT GGTTACATGG 1260 

ACCGCCCCTG CCTCAATCCG GCCGATCCAG ACTGCCCCGC CACAGCCCCC AACAAAAATT 1320 

CAACCAAACC TCTTGATATG GCCCTTGrTT TCAATGGTGC ATGTCATGGC TTATCCAGAA 1380 

AGTATATGCA CTGGCAGGAG GA6TTGATTG TGGCTGGCAC AGTCAAGAAC AGCACTGGAA 1440 

AACTCGTCAG CGCCCATGCC CTGCACACCA TGTTCCyVGTT AATGACTCCC AAGCAAATGT 1500 

ACGAGCACTT CAAGGGGTAC GAGTATGTCT CACACATCAA CTGGAACGAG GACAAAGCGG 1560 

CAGCCATCCT GGAGGCCTGG CAGAGGACAT ATGTGGAGGT GGTTCATCAO AGTCTCGCAC 1620 

AGAACTCCAC TCAAAAGGTG CTTTCCTTCA CCACCACGAC CCTGGACGAC ATCCTGAAAT 1680 

CCTTCTCTGA CGTCAGTCTC ATCCGCGTGC CCAGCGGCTA CTTACTCATG CTCGCCTATG 1740 

CCTGTCTAAC CATGCTGCGC TGGGACT6CT CCAAGTCCCA GCGTGCCGTG GGGCTGGCTC 1800 
GCGTCCTGCT GGTTGCaCTG TCAGTGGCTG CAGGACTGGG CCTGTGCTCA TTGATCGCAA 1860 

TTTCCTTTAA CGCTGCAACA ACTCAGGTTT TGCCATTTCT CGCTCTTCGT GTTGGTGTGG 1920 

ATGATGTTTT TCTTCTGGCC CACGCCTTCA GTGAAACAGG ACAGAATAAA AGAATCCCTT 1980 

TTGACGACAG GACCGGGGAG TGCCTGAACC GCACAGGACC CAGCGTGGCC CTCACGTCCa 2040 

TCAGCAATGT CACAGCCTTC TTCATCGCCG CGTTAATCCC AATTCCCGCT CTGCGGGC6T 2100 

TCTCCCTCCA GGCAGCGGTA GTAGTCGTGT TCAATTTTGC CATGCTTCTG CTCATTTTTC 2160 

CTGCAATTCT CAGCATGGAT TTATATCGAC GCGAGGACAG GAGACTGGAT ATTTTCTGCT 2220 

GTTTTACAAG CCCCTGCGTC AGCAGAGTCA TTCAGGTTGA ACCTCAGGCC TACACCGAC» 2280 

CACACGACAA TACCCGCTAC AGCCCCCCAC CTCCCTACAG CAGCCACAOC TTTGCCCATG 2340 

AAACGCAGAT TACCRTGCAG TCCACTGTCC AGCTCCGCAC GGAGTACGAC CCCCACACGC 2400 

ACGTGTACTA CACCACCGCT 6AGCCGCGCT CCGA6ATCTC TGTGCAGCCC GTCACCGTGA 2460 

CACAGGACAC CCTCAGCTGC CAGAGCCCAG AGAGCACCAG CTCCACAAG6 GACCTGCTCT 2520 

CCCAGTTCTC CGACTCCAGC CTCCACTGCC TCGAGCCCCC CTGTACX3AAG TGGACACTCT 2580 

CATCTTTTGC TGAGAAGCAC TATGCTCCTT TCCTCTTCAA ACCAAAAGCC AAGGTAGTGG 2640 

TGATCTTCCT TTTTCTGGGC TTGCTGGGGC TCAGCCTTTA TGGCACCACC CGAGTGAGAG 2700 

ACGGGCTGGA CCTTACGGAC ATTGTACCTC GGGAAACCAG ACAATATGAC TTTATTGCTG 2760 

CACAATTCAA ATACTTTTCT TTCTACAACA TGTATATACT CACCCAGAAA GCAGACTACC 2820 

CGAATATCCA GCACTTACTT TACGACCTAC ACAGGAGTTT CAGTAACGTG AAGTATGTCA 2880 

TGTTCGAAGA AAACAAACAG CTTCCCAAAA TGTGGCTGCA CTACTTCAGA GACTGGCTTC 2940 

AGGGACTTCA GGATGCATTT GACAGTGACT GGGAAACCGG GAAAATCATG CCAAACAATT 3000 

ACAAGAATGG ATCAGACGAT GGAGTCCTTG CCTACAAACT CCTGGTGCAA ACCGGCAGCC 3060 

GCGATAAGCC CATCGACATC AGCCAGTTGA CTAAACAGCG TCTGGTGGAT GCAGATCCCA 3120 

TCATTAATCC CAGCGCTTTC TACATCTACC TGACGGCTTG GGTCAGCAAC GACCCCGTCG 3180 

CGTATGCTGC CTCCCAGCCC AACATCCGGC CACACCGACC AGAATGGGTC CACGACAAAG 3240 

COGACTACAT GCCTGAAACA AGGCTGAGAA TCCCGGCAGC AGAGCCCATC GAGTATGCCC 3300 

AGTTCCCTTT CTACCTCAAC GGGTTGCGGG ACACCTCAGA CTTTGT6GAG CCAATTCAAA 3360 

AAGTAAGGAC CATCTGCAGC AACTATACCA GCCTGGGGCT GTCCAGTTAC CCCAACGGCT 3420 

ACCCCTTCCT CTTCTGGGAG CAGTACATOG GCCTCCGCCA CTGGCTGCTG CTGTTCATCA 3480 

GCGTGGTGTT GGCCTGCACA TTCCTCCTGT GCGCTGTCTT CCTTCTGAAC CCCTGGACGG 3540 

CCGGGATCAT TGTGATGGTC CTGGOGCTGA TCACCGTCGA GCTGTTCGGC ATGATGGGCC 3600 

TCATCGGAAT CAAGCTCAGT GCCGTGCCOG TCCTCATCCT GATCGCTTCT GTTGGCATAG 3660 

GAGTGGAGTT CACCGTTCAC GTTGCTTTGG CCTTTCTCAC GGCCATOGGC GACAAGAACC 3720 

GCAGGGCTGT GCTTGCCCTG GAGCACATGT TTGCACCCGT CCTGGATGGC GCCGTGTCCA 3780 

CTCTGCTGGG AGTGCTGATG CTGGCGGGAT CTGAGTTCGA CTTCATTGTC AGGTATTTCT 3840 

TTGCTGTGCT GGCGATCCTC ACCATOCTCG GOGTTCTCAA TGGGCTGCTT TTGCTTCCCG 3900 

TGCTTTTGTC TTTCTTTGGA CCATATCCTG AGCTGTCTCC AGCCAACGGC TTGAACCGCC 3960 

TGCCCACACC CTCCCCTGAG CCACCCCCCA GOGTGGTCCG CTTCGCCATG CCGCCCGGCC 4020 

ACACGCACAG OGGGTCTGAT TCCTCCGACT CGGAGTATAG TTCCCAGACG ACAGTGTCAG 4080 

GCCTCAGCGA GGAGCTTCGG CACTACGAGG CCCAGCAGGG CGCCGGAGGC CCTGCCCACC 4140 

AAGTGATCGT GGAAGCCACA GAAAACCCCG TCTTCGCCXA CTCCACTGTG GTCCATCCOG 4200 

AATCCAGCCA TCHCCCACCC TCGAACCOGA CACAGCACCC CCACCTGCAC TCACCCTCCC 4260 

TGCCTCCCGG ACGGCAAGGC CACCACCCCC GCAGGGACCC CCCCAGAGAA GGCTTGTGGC 4320 

CACCCCTCTA CAGACCGCGC AGACAC6CTT TTGAAATTTC TACTCAAGGG CATTCTCGCC 4380 

CTAGCAATAG GGCCCCCTGG GGCCCTCGCG CGGCCCGTTC TCACAACCCT CGGAACCCAG 4440 

CGTCCACTGC CATGGGCAGC TCCXSTCCCCG GCTACTGCCA GCCCATCACC ACTGTGACGG 4500 

CTTCTGCCTC CGTGACTGTC GCCCTGCACC CGCCGCCTGT CCCTGGGCCT GCGCGGAACC 4560 

CCCGAGGCGG ACTCTGCCCA GGCTACCCTG AGACTGACCA CGGCCTGTTT GAGGACCCCC 4620 

ACGTGCCTTT CCACGTCCGG TGTGAGACGA GGGATTCGAA GCTGGAAGTC ATTGAGCTGC 4680 

MGACGTCGA ATGCGAGGAG AGGCCCCGCC GAAGCAGCTC CAACTGAGGG TGATTAAAAT 4740 

CICAAGCAAA GAGGCCAAAG ATTGGAAACC CCCCACCCCC ACCTCTTTCC AGAACTCCTT 4800 

GAAGAGAACT GGTTGGAGTT ATGGAAAAGA TGCCCTGTGC CAGGACAGCA CTTCATTGTT 4860 

ACTGTAACCG ATTGTATTAT TTTCTTAAAT ATTTCTATAA ATATTTAAGA CATGTACACA 4920 

TGTGTAATAT AGGAAGGAAC GATGTAAAGT GGTATGATCT GGCGCTTCTC CACTCCTGCC 4980 

CCAGAGTGTO GACGCCACAG TGGGGCCTCT CCGTATTTGT GCATTGGGCT CCGTGCCACA 5040 

ACCAAGCTTC ATTAGTCTTA AATTTCAGCA TATGTTGCTG CTGCTTAAAT ATTGTATAAT 5100 

TTACTTCTAT AATTCTATCC ARATATTCCT TATGTAATAG GATTATTTTG TAAAGGTTTC 5160 

TGTTTAAAAT ATPTTAAATT TCCATATCAC AACCCTGTGG TAGTATGAAA TGTTACTGTT 5220 

AACTTTCAAA CACGCTATGC GTGATAATTT TTTTGTTTAA TGAGCAGATA TGAAGAAAGC 5280 

CCGGAATT 5288 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1447 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Ala Ser Ala Gly Asn Ala Ala Glu Pro Gln Asp Arg Gly Gly Gly 
1 5 10 

Gly Ser Gly Cys Ile Gly Ala Pro Gly Arg Pro Ala Gly Gly Gly Arg 
20 25 

Arg Arg Arg Thr Gly Gly Leu Arg Arg Ala Ala Ala Pro Asp Arg Asp 
35 40 45 

Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu Glu Gln 
50 55 60 

Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp Leu Arg 
65 70 75 80 

Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile Gln Lys 
85 90 95 

Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly Ala Phe 
100 105 110 

Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu Glu Leu 
115 120 125 

Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr Thr Arg 
130 135 140 

Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met Ile Gln 
150 155 160 

Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala Leu Leu 
165 170 175 

Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val Tyr Met 
180 185 190 

Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser Gly Glu 
195 200 205 

Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr Leu Tyr 
210 215 220 

Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ala Lys 
225 230 235 240 

Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu Arg Trp 
245 250 255 

Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys Ile Asn 
260 265 270 

Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu Val Gly 
275 280 285 

His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro Asp Cys 
290 295 300 

Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp Met Ala 
305 310 315 320 

Leu Val Leu Asn Gly Gly Cys His Gly Leu Ser Arg Lys Tyr Met His 
325 330 335 

Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ser Thr Gly 
340 345 350 



.ys Leu V.I Ser Ma His Ala Leu Oln Thr Met Phe Oln Leu He. Thr 
355 

P„ I,,. «n M« »" '"^ =^0 

370 

T«« Ala Ala Ala He Leu Glu Ala Trp Gin 
He Asn Trp Asn Glu Asp Lys Ala Aia aj. 

385 390 

Thr Tyr Val Glu Val Val His Gin Ser Val Ala Cln Asn Ser Thr 

405 

Cln Lys val Leu Ser Phe Thr Thr Thr Thr Leu Asp Asp lie Leu Lys 
420 

* ser val He Arg Val Ala Ser Gly Tyr Leu Leu 

Ser Phe Ser Asp Val Ser vai xxe ni.y 

435 

Ala Cya Leu Thr Met Leu Arg Trp Asp Cys Ser Lye 



455 



460 



Met Leu Ala Tyr 
450 

S.r .in Gl, Al. V.1 =1, =ly V.1 L.U L.u V.l «. «r 

465 "0 * 

,.1 »I. Ol, Le. 0.y cy. s„ ^ lie =1, II. s.r Jh. 

485 

M. «. Thr Thr Oln V.1 ^ Pro Phe Ma ^ oly OXy v.l 



500 



«P v.1 Phe »1. HI. .1. Ph. ser =lu Thr cly cln »n 

515 

Lys Arg lie Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys Arg Thr 

530 

Oly M. s.r v.1 »1. Thr s.r 11. s.r »n v.l Thr .1. Ph. Ph. 

545 

M. »1. L.. XI. P" »• P" "° 

565 

»1, «. val val v.1 v.1 Ph. A.n Ph. Ala Met v.1 L.u Lev II. Ph. 

580 

pro Al. U. s.r «.t A.P ^ Tyr Ar, Ar, =1» A.p Ar, »u 

595 

„p 11. Ph. cy. C. Ph. Thr s.r Pro cy. val s.r Ar, v.1 U. oln 

610 

val Glu pro Gin Ala Tyr Thr Asp Thr His Asp Asn Thr Arg Tyr Ser 
625 

pro Pro pro pro Tyr 5« Ser Bi. s.r Phe Al. Hi. clo Thr Oln II 



645 

Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro His Thr 
660 665 670 

His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser Val Gln 
675 680 685 

Pro Val Thr Val Thr Gln Asp Thr Leu Ser Cys Gln Ser Pro Glu Ser 
690 695 700 

Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser Ser Leu 
705 710 715 720 

His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser Phe Ala 
725 730 735 

Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys Val Val 
740 745 750 

Val Ile Phe Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr Gly Thr 
755 760 765 

Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro Arg Glu 
770 775 780 

Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe Ser Phe 
785 790 795 800 

Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn Ile Gln 
805 810 815 

His Leu Leu Tyr Asp Leu His Arg Ser Phe Ser Asn Val Lys Tyr Val 
820 825 830 

Met Leu Glu Glu Asn Lys Gln Leu Pro Lys Met Trp Leu His Tyr Phe 
835 840 845 

Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp Trp Glu 
850 855 860 

Thr Gly Lys Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp Asp Gly 
865 870 875 880 

Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp Lys Pro 
885 890 895 

Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala Asp Gly 
900 905 910 

Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp Val Ser 
915 920 925 

Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg Pro His 
930 935 940 

Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu Thr Arg 
945 950 955 960 

Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe Pro Phe 
965 970 975 

Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala Ile Glu 
980 985 990 

Lys Val Arg Thr Ile Cys Ser Asn Tyr Thr Ser Leu Gly Leu Ser Ser 
995 1000 1005 

Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile Gly Leu 
1010 1015 1020 

Arg His Trp Leu Leu Leu Phe Ile Ser Val Val Leu Ala Cys Thr Phe 
1025 1030 1035 1040 

Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly Ile Ile 
1045 1050 1055 

Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met Met Gly 
1060 1065 1070 

Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu Ile Ala 
1075 1080 1085 

Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu Ala Phe 
1090 1095 1100 

Leu Thr Ala Ile Gly Asp Lys Asn Arg Arg Ala Val Leu Ala Leu Glu 
1105 1110 1115 1120 

His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu Leu Gly 
1125 1130 1135 

Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg Tyr Phe 
1140 1145 1150 

Phe Ala Val Leu Ala Ile Leu Thr Ile Leu Gly Val Leu Asn Gly Leu 
1155 1160 1165 

Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Tyr Pro Glu Val 
1170 1175 1180 

Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro Glu Pro 
1185 1190 1195 1200 

Pro Pro Ser Val Val Arg Phe Ala Met Pro Pro Gly His Thr His Ser 
1205 1210 1215 

Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr Val Ser 
1220 1225 1230 

Gly Leu Ser Glu Glu Leu Arg His Tyr Glu Ala Gln Gln Gly Ala Gly 
1235 1240 1245 

Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro Val Phe 
1250 1255 1260 

Ala His Ser Thr Val Val His Pro Glu Ser Arg His His Pro Pro Ser 
1265 1270 1275 1280 

Asn Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Pro Pro Gly 
1285 1290 1295 

Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly Leu Trp 
1300 1305 1310 

Pro Pro Leu Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser Thr Glu 
1315 1320 1325 

Gly His Ser Gly Pro Ser Asn Arg Ala Arg Trp Gly Pro Arg Gly Ala 
1330 1335 1340 

Arg Ser His Asn Pro Arg Asn Pro Ala Ser Thr Ala Met Gly Ser Ser 
1345 1350 1355 1360 

Val Pro Gly Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser Ala Ser 
1365 1370 1375 

Val Thr Val Ala Val His Pro Pro Pro Val Pro Gly Pro Gly Arg Asn 
1380 1385 1390 

Pro Arg Gly Gly Leu Cys Pro Gly Tyr Pro Glu Thr Asp His Gly Leu 
1395 1400 1405 

Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu Arg Arg Asp 
1410 1415 1420 

Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys Glu Glu Arg 
1425 1430 1435 1440 

Pro Arg Gly Ser Ser Ser Asn 
1445 

1. A DNA sequence other than present in a chromosome encoding a patched gene 
other than the Drosophila patched gene or fragment thereof of at least about 12 bp 
different from the sequence of the Drosophila patched gene. 

2. A DNA sequence according to Claim 1, wherein said patched gene is a 
mammalian gene. 

3. A DNA sequence according to Claim 1 for human, mouse, mosquito, butterfly 
or beetle patched gene. 

4. A DNA sequence according to Claim 3, wherein said DNA sequence is a 
human sequence. 

5. A DNA sequence according to Claim 4, wherein said DNA sequence is a 
mouse sequence. 

6. A DNA sequence according to Claim 1, wherein said DNA sequence is a 
fragment of at least about 18 bp. 

7. A DNA sequence according to Claim 1 joined to a DNA sequence comprising 
a restriction enzyme recognition sequence. 

8. An expression cassette comprising a transcriptional initiation region functional 
in an expression host, a DNA sequence according to Claim 1 under the 
transcriptional regulation of said transcriptional initiation region, and a 
transcriptional termination region functional in said expression host. 

9. An expression cassette according to Claim 8, wherein said transcriptional 
initiation region is heterologous to said DNA sequence according to Claim 1. 

10. An expression cassette according to Claim 8, wherein said transcriptional 
initiation region is homologous to said DNA sequence according to Claim 1 and 
includes the enhancer region. 

11. A cell comprising an expression cassette according to Claim 8 as part of an 
extrachromosomal element or integrated into the genome of a host cell as a result of 
introduction of said expression cassette into said host cell and the cellular progeny of 
said host cell. 

12. A cell according to Claim 11, further comprising the patched protein in the 
cellular membrane of said cell. 

13. A cell according to Claim 11, wherein said patched protein is a mouse patched 
protein. 

14. A cell according to Claim 11, wherein said patched gene is a human patched 
protein. 

15. A cell according to Claim 11, wherein said transcriptional initiation region is a 
Drosophila patched gene transcriptional initiation region comprising the promoter 
and enhancer joined to a heterologous gene. 

16. A cell comprising an expression cassette comprising a transcriptional initiation 
region functional in an expression host, said transcriptional initiation region 
consisting of a 5' non-coding region regulating the transcription of patched protein 
comprising the promoter and enhancer, a marker gene, and a transcriptional 
termination region, as part of an extrachromosomal element or integrated into the 
genome of a host cell as a result of introduction of said expression cassette into said 
host, and the cellular progeny thereof. 

17. A cell according to Claim 16, wherein said transcriptional initiation region is 
the Drosophila region. 

18. A method for following embryonic development employing the patched 
protein in an embryo, said method comprising: 

integrating an expression cassette comprising a transcriptional initiation region 
functional in embryonic host cells, said transcriptional initiation region consisting of 
a 5' non-coding region regulating the transcription of patched protein, a marker 
gene, and a transcriptional termination region, wherein said embryonic host cells are 
capable of developing into a fetus; 

growing said embryonic host cells, whereby proliferation and differentiation 
occur; and 

locating cells comprising expression of the patched protein by means of 
expression of said marker gene. 



19. A method for producing patched protein, said method comprising: 

growing a cell according to Claim 11, whereby said patched protein is 
expressed; and 

isolating said patched protein free of other proteins. 



20. A method for screening candidate compounds for binding affinity to the 
patched protein, said method comprising: 

combining said candidate protein with a vertebrate or invertebrate cell 
comprising said patched protein in the membrane of said cell and an expression 
cassette comprising a transcriptional initiation region functional in said cell, a DNA 
sequence according to Claim 1 comprising the entire coding sequence under the 
transcriptional regulation of said transcriptional initiation region, and a 
transcriptional termination region functional in said cell, expressing said patched 
protein in said cell; and 

assaying for the binding of said candidate compound to said patched protein. 

21. A method for screening candidate compounds for agonist activity with the 
patched protein, said method comprising: 

combining said candidate protein with a vertebrate or invertebrate cell 
comprising said patched protein in the membrane of said cell and an expression 
cassette comprising a transcriptional initiation region functional in an expression 
host, said transcriptional initiation region consisting of a 5' non-coding region 
regulating the transcription of patched protein, a marker gene, and a transcriptional 
termination region, as part of an extrachromosomal element or integrated into the 
genome of a host cell; and 

assaying for the expression of said marker gene. 

22. A monoclonal antibody binding specifically to a patched protein, other than 
the Drosophila patched protein. 